Artificial intelligence in information retrieval
AI has long been used as a tool in databases and search engines, as well as in the algorithms that control their operations. However, large language models (LLMs) have made AI a visible part of traditional databases, too.
Language models use the law of averages
Large language models are trained on a huge mass of text. An application that uses a language model is not a search engine in itself; it generates an answer based on, among other things, the probability of words occurring consecutively.
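The idea of generating text from word probabilities can be illustrated with a toy example. The sketch below is not how a real large language model works internally (those use neural networks trained on vastly more text); it only counts which word follows which in a tiny made-up corpus and then picks each next word in proportion to those counts. The corpus and all function names are invented for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Convert the follower counts of one word into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


def generate(counts, start, length, seed=None):
    """Generate up to `length` words, sampling each next word by its frequency."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = counts.get(words[-1])
        if not followers:
            break  # no known continuation for this word
        choices = list(followers.keys())
        weights = list(followers.values())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


# Made-up sample text, only for illustration.
corpus = "the library has books . the library has journals . the database has articles ."
model = train_bigrams(corpus)

# "library" follows "the" twice as often as "database" does in this corpus.
print(next_word_probabilities(model, "the"))
print(generate(model, "the", 4))
```

Because the next word is sampled rather than looked up, running `generate` several times can give different continuations for the same starting word. The generated text is also only as good as the text the counts were built from.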
Combining a language model with search is more reliable
Most generative AI applications can utilise genuine sources. Retrieval-augmented generation (RAG) is a technique that combines a language model with information retrieved from an external source. The search can target the Internet, a specific database or an organisation’s own documents, for example.
The AI’s answer is thus a combination of text generated by a trained language model and information from real publications.
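The two parts of RAG, retrieval and generation, can be sketched roughly as follows. This is a minimal illustration under several assumptions: the three sample documents are made up, retrieval is a simple word-overlap ranking rather than the search methods real applications use, and the language model call itself is left out. The sketch only shows how retrieved sources might be placed in front of the question before it is passed to a model.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    scored = []
    for doc in documents:
        score = len(question_words & tokenize(doc["text"]))
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, sources):
    """Combine the retrieved sources and the question into one prompt."""
    lines = ["Answer the question using only the numbered sources below.", ""]
    for number, doc in enumerate(sources, start=1):
        lines.append(f"[{number}] {doc['title']}: {doc['text']}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Made-up sample documents, only for illustration.
documents = [
    {"title": "Sleep and memory",
     "text": "Sleep deprivation impairs memory consolidation in students."},
    {"title": "Forest soils",
     "text": "Boreal forest soils store large amounts of carbon."},
    {"title": "Study habits",
     "text": "Regular sleep supports learning and memory in university students."},
]

question = "How does sleep affect memory in students?"
sources = retrieve(question, documents)
prompt = build_prompt(question, sources)
print(prompt)  # this text would then be passed to a language model
```

The numbered sources in the prompt are what allow an application to attach references to its answer. The wording of the answer still comes from the language model.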
Applications utilising language models can be used to help information retrieval
Generative AI applications that use language models can assist information retrieval in many ways. You can use them to get to know the topic you are researching and to ask for suitable keywords and even ready-made queries for databases. Some AI applications are also suitable for searching for sources directly. The search is done in natural language, without formal queries.
The answer is not always the same, even if your question is exactly the same. It is always up to you to evaluate the validity of the answers.
Be careful with data privacy and security! Generative AI applications should not be given sensitive or confidential information.
Generative AI can suggest search terms
You can ask AI for suitable search terms for your topic. If the suggested search terms seem very general or obvious, ask again for more specific words.
Often, instead of individual words, you will receive longer phrases as an answer. When you edit these phrases into search queries, break them down into separate terms and add operators between them.
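Breaking a phrase into terms and joining them with an operator can be sketched as below. The function name, the short stop-word list and the sample phrase are all invented for illustration. The sketch also assumes that the database accepts an uppercase operator such as AND between terms, so check the syntax of the database you are using.

```python
import re

# A short illustrative list of words that rarely help as search terms.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "to"}


def phrase_to_query(phrase, operator="AND"):
    """Split a phrase into separate terms and join them with an operator."""
    words = re.findall(r"[\w-]+", phrase.lower())
    terms = [word for word in words if word not in STOPWORDS]
    return f" {operator} ".join(terms)


# A phrase of the kind an AI application might suggest (made up).
suggestion = "effects of social media on adolescent sleep"
print(phrase_to_query(suggestion))
```

The result is only a starting point. Terms that belong together as one concept, or synonyms that should be combined with OR, still need to be grouped by hand.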
Formulating search queries is partially successful
Some applications can also formulate search queries suitable for databases when asked. The operators and phrases are usually marked correctly (not always!), but word truncation is missing. Additionally, individual search terms are often enclosed in quotation marks, which generally weakens the database’s ability to automatically find different forms or spellings of words.
AI can suggest search terms that are too general for the topic. You can add some restricting terms to the query, or you can remove terms that return irrelevant results.
It is common that search term suggestions are phrases. If you ask, the application can remove the quotation marks, but it cannot reorganise the terms with operators.
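Removing quotation marks from single-word terms can also be done with a small helper. The sketch below assumes that the query uses straight double quotes and that operators are separated from terms by spaces. The function name and the sample query are invented for illustration. Truncation is deliberately left out, because the truncation symbol varies between databases and choosing where to cut a word requires judgement.

```python
import re


def unquote_single_terms(query):
    """Remove quotes around single words but keep them around multi-word phrases."""
    # Matches a quote, one run of characters without spaces or quotes, and a quote.
    return re.sub(r'"([^"\s]+)"', r"\1", query)


# A query of the kind an AI application might produce (made up).
suggested = '"social media" AND "sleep" AND ("adolescent" OR "teenager")'
print(unquote_single_terms(suggested))
```

The phrase "social media" keeps its quotation marks, while the single words lose theirs, so the database is free to match their different forms and spellings.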
AI in databases and information retrieval
Traditional databases now include generative AI tools based on language models, too.
- In the Web of Science database, the AI tool is called Research Assistant.
- In the UEF Primo International articles search, a similar tool, AI Research Assistant, is available.
- In the Scopus database, the tool is named Scopus AI.
The database’s AI tool uses real articles as its sources, which it retrieves from the database itself. However, only the abstracts of the articles can be used, so the information provided is limited. Based on the abstracts and a language model, the tool creates a summary of the topic. The search and the resulting summary are only a quick overview of the topic, not a thorough or exhaustive answer. The search can easily be continued with more specific questions, though.
Many separate AI applications can already search for real sources on which they base their answers. The search can target the open web or specifically only openly published scientific articles, for example.
Not all questions have a “good” answer, even when true sources are utilised. An AI application may still provide an answer with the help of a language model, and it may refer to sources. The sources must always be checked to see which information is based on true sources and which is generated by the language model alone.
Read more about the topic on the library’s homepage: AI in information retrieval. The page provides more details on which applications to use and how to use them in information retrieval.