Friday, September 23, 2011

Information Retrieval Systems: thoughts on Boolean Search

So I'm taking a challenging (for me) course this term with lots of great subjects for thought. This week we are looking at Boolean Search (AND, OR, NOT statements). The prof wanted to know whether we use it and in what situations we don't.

I said that I use Boolean searching in a library catalog or database when I know the thing I'm looking for exists, and that I use keyword relevancy searching when I don't know whether it exists. Boolean searching is difficult when there is some doubt about whether a document exists at all, or about which terms would describe it.

For example, just this week I had a library patron who wanted "the book by Laura Ingels about the demise of culture with the catchy funny title. She's a national commentator and was just interviewed about the book." It turned out that using Google's keyword search "laura funny culture book" (and not a Boolean search in the catalog) was the perfect way to find the book, which was by Laura Ingraham and titled Of Thee I Zing. The Google result which answered the query was an Amazon page of the book which (I guess) pulled together Laura from the author name, culture from the title, and funny from the reviews.
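To make the contrast concrete, here is a rough Python sketch of why a strict Boolean AND can come back empty while a keyword relevancy search still surfaces the right book. The catalog records, their indexed words, and the function names are all made up for illustration; the scoring is a simple count of matching words, not how Google or any real catalog ranks results.

```python
import re

# Toy catalog, invented for illustration: title -> words indexed for that record.
# Note that none of the records contain the word "funny".
CATALOG = {
    "Of Thee I Zing": "laura ingraham of thee i zing culture",
    "Little House on the Prairie": "laura ingalls wilder little house on the prairie",
    "Joy of Cooking": "irma rombauer joy of cooking cookbook",
}

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def boolean_and(query, catalog):
    """Return titles whose record contains EVERY query word (strict AND)."""
    terms = tokenize(query)
    return sorted(t for t, text in catalog.items() if terms <= tokenize(text))

def ranked(query, catalog):
    """Return titles with at least one matching word, best match first."""
    terms = tokenize(query)
    scores = {t: len(terms & tokenize(text)) for t, text in catalog.items()}
    hits = [t for t, score in scores.items() if score > 0]
    # Sort by descending score, then alphabetically so ties are stable.
    return sorted(hits, key=lambda t: (-scores[t], t))

query = "laura funny culture book"
print(boolean_and(query, CATALOG))  # strict AND: every word must match
print(ranked(query, CATALOG))       # partial matches, ordered by overlap
```

In this toy setup the AND search finds nothing because no single record holds all four words, while the ranked search still puts the record with the most matching words at the top. That is the situation I was in with the patron: I had a few roughly right words and no certainty about any of them.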

I also looked at another of the prof's questions, which was about controlled vocabulary and expecting users to know what they are looking for (at least in terms of the specific words they use to search). I think it's fair to expect some level of responsibility on the part of the user for learning the vocabulary of their search need. So if a user is searching a specific microcosm of information (say a health database), I have no problem recommending that she become as familiar as possible with the language of her search, such as by browsing the controlled vocabulary or thesaurus. That way she can consciously craft a search using general or specific terms, preferred terms or natural language, and understand how the terms she chooses will affect the types of documents returned and their relevance.

However, I especially like Saracevic's third powerful idea about information science, that of "interaction, enabling direct exchanges and feedback between systems and people engaged in IR processes" (p. 1052). If the system is designed for interaction, then the user might be able to define more precise queries through computer prompting of related keywords. Sort of a brainstorming session with the computer.
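One very simple way a system could prompt the user with related keywords is to look at which words show up in the same documents as the words already typed. The Python sketch below does just that. The sample documents, the stopword list, and the function name are invented for illustration, and this is only one possible approach, not something Saracevic describes.

```python
import re
from collections import Counter

# Toy document collection, invented for illustration.
DOCS = [
    "heart attack symptoms and treatment",
    "myocardial infarction treatment guidelines",
    "heart attack also called myocardial infarction",
    "diabetes diet and treatment",
]

# Small hand-picked stopword list (an assumption, not a standard list).
STOPWORDS = {"and", "also", "called"}

def suggest_terms(query, docs, limit=3):
    """Suggest words that co-occur with the query words in the same documents."""
    query_terms = set(re.findall(r"[a-z]+", query.lower()))
    counts = Counter()
    for doc in docs:
        words = set(re.findall(r"[a-z]+", doc.lower())) - STOPWORDS
        # Only documents sharing at least one word with the query contribute.
        if query_terms & words:
            counts.update(words - query_terms)
    # Most frequent co-occurring words first; ties broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ordered[:limit]]

print(suggest_terms("heart attack", DOCS))
```

A user who types the everyday phrase could be shown the preferred clinical terms that appear alongside it, then decide whether to add them to the search. A real system would more likely draw its suggestions from the thesaurus itself, but the back-and-forth is the same idea.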

Saracevic, T. (1999). "Information science." Journal of the American Society for Information Science, 50(12), 1051-1063.