Brain Dump

Boolean Retrieval

Tags
text-processing

An information retrieval model that classifies each document in an index as either relevant or irrelevant to the query. The basic idea: if a document contains the search term, it's relevant; otherwise it isn't.

Supports boolean operators (see page 46), but they function as set operations over the matching documents: AND is intersection, OR is union, and NOT is the complement.
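To make the set-operation view concrete, here's a minimal sketch in Python. The toy documents, the `build_index` helper, and the whitespace tokenization are all assumptions made for illustration, not something from the source; a real system would normalize terms and store sorted postings lists rather than plain sets.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased term to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenization (assumption)
            index[term].add(doc_id)
    return index

# Hypothetical toy collection, keyed by document id.
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
    4: "july new home sales rise",
}
index = build_index(docs)
all_ids = set(docs)

# Each boolean operator is a set operation on the matching document ids.
home_and_july = index["home"] & index["july"]  # AND -> intersection
rise_or_top = index["rise"] | index["top"]     # OR  -> union
not_new = all_ids - index["new"]               # NOT -> complement

print(sorted(home_and_july), sorted(rise_or_top), sorted(not_new))
```

Note that every result is just a set of document ids: a document is either in it or not, and nothing says which match is better than another.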

This approach is great for producing high-precision queries, but writing them requires expert knowledge. It's not a good fit for the vast majority of users because:

  • it's unnatural to write boolean queries (thinking in collections 🙃).
  • you'll have to wade through thousands of unranked results (except for very specific queries).
