Search strategy formulation

When you search for literature or information, you want to find all the relevant material without being swamped by the irrelevant material that is also available. This goal is described by two concepts: recall and precision.

Recall and precision as tools to adjust result sets

Recall

Recall (also called sensitivity) refers to the proportion of relevant results retrieved compared to all the relevant items there are in the whole database. Basically, you wish to maximise the quantity of relevant results.

Recall = number of relevant items retrieved / number of all relevant items in the database

Precision

Precision is the proportion of relevant results among all the results your query actually returns. (It is sometimes loosely called specificity, although in the strict statistical sense it corresponds to positive predictive value.) In general, further work is easier if the quality of the results is high and you don’t need to spend time sifting through a lot of irrelevant information.

Precision = number of relevant items retrieved / number of all items in the search result
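The two ratios above can be sketched in a few lines of Python. This is only an illustration: the function names and the record IDs are invented for the example, and in a real database the full set of relevant items is rarely known.

```python
def recall(retrieved, relevant):
    """Share of all relevant items in the database that the search retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Share of the retrieved items that are actually relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical record IDs, invented for illustration.
relevant = {1, 2, 3, 4, 5, 6, 7, 8}   # all relevant items in the database
retrieved = {1, 2, 3, 4, 9, 10}       # items returned by the query

print(recall(retrieved, relevant))     # 4 relevant hits out of 8 relevant items: 0.5
print(precision(retrieved, relevant))  # 4 relevant hits out of 6 results: about 0.67
```

Here the query finds only half of the relevant items (recall 0.5), and a third of what it returns is irrelevant (precision about 0.67).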

Ideally you want both high recall and high precision, but unfortunately they tend to be inversely related: improving one usually weakens the other.

In a comprehensive search (high recall), you will also get irrelevant results (low precision); and vice versa.

Figure: Relation of recall and precision. The x-axis shows precision and the y-axis shows recall; recall is high when precision is low, and low when precision is high.

One reason for this phenomenon is that a search query is a simplified model of the actual search topic, consisting of keywords and operators that describe the relationships between them. This model represents the topic incompletely. Another reason is that even scientific language is not exact: keywords may be inaccurate and have multiple meanings. Search engines do not understand meaning; they only match character strings.

High recall is achieved by using

  • comprehensively all the possible terms and synonyms related to each concept of the topic, even if they are not very specific (= a larger number of OR-operations in a query)
  • fewer AND-operations that limit the result
  • all search fields, instead of selecting just the title or keywords
  • an AND-operator (or a proximity operator) instead of a phrase

High precision is achieved by using

  • terms describing the topic accurately, avoiding terms with multiple meanings
  • more AND-operations for focusing the topic
  • the search fields that are most informative about the actual contents: title and/or subject terms
  • phrases or proximity operators wherever suitable
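The effect of these choices can be tried out on a toy example. The sketch below is not how any real database works: the records, the assumed set of relevant items, and the `search` and `evaluate` helpers are all invented for illustration. A query is a list of OR-groups combined with AND, and matching is plain substring comparison, in line with the point that search engines only match character strings.

```python
# Toy records: (id, title, abstract). All content is invented for illustration.
RECORDS = [
    (1, "Solar cell efficiency", "We study photovoltaic panels."),
    (2, "Photovoltaic materials", "New materials for solar cell design."),
    (3, "Cell biology basics", "Solar radiation damages skin cell DNA."),
    (4, "Wind turbines", "A comparison with solar power and battery cell storage."),
    (5, "Perovskite photovoltaics", "Thin-film devices with high efficiency."),
]
RELEVANT = {1, 2, 5}  # assumed ground truth: records actually about solar cells


def search(groups, fields=("title", "abstract")):
    """Return IDs of records that match every group (AND).

    A group matches if any of its terms (OR) occurs as a substring
    in the selected fields.
    """
    hits = set()
    for rec_id, title, abstract in RECORDS:
        parts = {"title": title, "abstract": abstract}
        text = " ".join(parts[name] for name in fields).lower()
        if all(any(term in text for term in group) for group in groups):
            hits.add(rec_id)
    return hits


def evaluate(hits):
    """Return (recall, precision) of a result set against RELEVANT."""
    found = len(hits & RELEVANT)
    recall = found / len(RELEVANT)
    precision = found / len(hits) if hits else 0.0
    return recall, precision


# High-recall style: synonyms combined with OR, all fields searched.
broad = search([["solar", "photovoltaic"]])
# High-precision style: an exact phrase, title field only.
narrow = search([["solar cell"]], fields=("title",))

print(sorted(broad), evaluate(broad))    # [1, 2, 3, 4, 5] (1.0, 0.6)
print(sorted(narrow), evaluate(narrow))  # [1] (0.333..., 1.0)
```

The broad query finds every relevant record but also returns two irrelevant ones; the narrow query returns only a relevant record but misses two others.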

Decide for yourself the goal of the search

Whether you should emphasise recall or precision depends on the aims of your search. For instance, if you are doing a literature search for a systematic review, recall is more important; read more about systematic reviews in the next chapter. If, on the other hand, you want to save time and find the most relevant publications, precision comes first.

Finding the right balance between the two is one of the challenges in information searching. Keep in mind that your needs may change over time. When you are just starting your search, it is useful to gather a few highly relevant publications, but later you may need to extend the scope.

Some special cases of searching

There is no single way to start or carry on a search. Here are some examples:
