Bibliometric analysis tools and databases
Most bibliometric studies are based on data covering only international scientific journals indexed in the most widely used commercial citation index databases:
- Web of Science (Clarivate Analytics)
- Scopus (Elsevier).
These databases, however, may give a limited picture of publishing activities in academic disciplines, since they largely exclude books, national publications and non-scholarly publications, which are especially important in the social sciences and humanities.
In recent years, the OpenAlex database has strengthened its position as a source of publication information. It is a non-commercial instrument that is freely available to all who wish to use it. In addition, it covers a wider range of research publications, such as books, book chapters, preprints, and reports, compared to commercial databases.
Besides the multidisciplinary WoS, Scopus and OpenAlex, several specialist databases, such as PubMed (medicine and the biomedical sciences), Chemical Abstracts, Mathematical Reviews, the ACM Digital Library (computer sciences) and CiteSeer (computer and information sciences), are available online, and most of them provide free access to bibliometric data in their particular field.
The databases and tools most commonly used for bibliometrics, including ones other than Web of Science and Scopus, are presented in the following sections.
Scopus
For a long time, Web of Science (WoS) was the major source for bibliometric analyses. In 2004, the Scopus citation index database (click ‘Full text availability’/‘Kokotekstin saatavuus’ to access the database) was launched; it is available to subscribers. Scopus is to some extent more comprehensive than WoS and covers a wider range of journals. Scopus offers about 20% more coverage for citation analysis than WoS, but it only includes publications from 1966 onwards and its citation data are limited to publications from 1996 onwards, whereas WoS goes back to 1900. Scopus covers engineering in particular better than WoS does, but its coverage of the social sciences and humanities is weak. In addition to scientific journals, Scopus also indexes books and conference papers.
Watch Scopus training videos, e.g.:
Scopus Tutorial: How to search for an author and view their profile (2:44)
Scopus Tutorial: How to assess an author’s impact (3:38)
Scopus Tutorial: How to View Document Metrics in Scopus (3:36)
Scopus Tutorial: How to View Journal Metrics for a Title in Scopus (3:31)
Scopus Tutorial: How to Create Citation Overviews in Scopus (2:47)
Practise using Scopus:
- Search for articles in your research area in Scopus Document search (access Scopus via UEF-Primo -> click the ‘Full text availability’/‘Kokotekstin saatavuus’ link). Tips for information retrieval can be found in the 1. Module: the Basics of information retrieval or in Scopus Help.
- Select articles you find interesting and create a citation overview of those articles in Scopus.
- Open one article you consider potentially useful. How many times has the article been cited? What is its field-weighted citation impact?
- Find information about the corresponding author by clicking the author’s name in the article. Analyze the author’s output: What’s the author’s h-index? How many citations does she/he have?
- Search for yourself or your supervisor/professor via the Scopus Author search. Again, find the author’s h-index and number of citations.
- In Scopus, you can find PlumX Metrics, which offer detailed information about the visibility of an article (including mentions in social media). Open the document details and check what kind of attention an article of your choice has received.
Further information:
Finnish national guide to publication metrics: Data sources: Scopus.
Web of Science
Web of Science (WoS) (click ‘Full text availability’/‘Kokotekstin saatavuus’ to access the database), a classic database dating back to the 1960s, is a pioneer of bibliometrics. WoS brings together the most widely used citation indexes:
- the Science Citation Index (SCI)
- the Social Sciences Citation Index (SSCI)
- the Arts & Humanities Citation Index (A&HCI).
WoS is available to subscribers and covers journals from all disciplines. However, it does not include national publications, books or non-scholarly material. Thus, the coverage of WoS varies widely between disciplines. Coverage is most comprehensive in the natural and medical sciences, whereas WoS accounts for only a small fraction of publications in the humanities and social sciences, where publishing is oriented towards national publications and books. In engineering, where conference proceedings dominate, WoS coverage is moderate.
WoS can be used to search for publications, citations and h-indexes. Search results can also be analysed in the same way as in the Scopus database. The use of Web of Science is significantly hindered by the fact that author and institution names have not been standardised. For example, a Finnish author with the letters ä or ö in their family name may be found in the database under more than ten different spelling variations.
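The name-variant problem described above can be illustrated with a small sketch. It is not how Web of Science works internally; it only shows one crude way to group spelling variants of a family name before counting publications. The function names (`fold_diacritics`, `name_key`, `group_variants`) and the example surname are assumptions made for illustration, and the key deliberately over-simplifies (for instance, it also collapses a genuine "ae" or "oe" in a name).

```python
import unicodedata


def fold_diacritics(name: str) -> str:
    """Remove combining marks, so that 'ä' becomes 'a' and 'ö' becomes 'o'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_key(name: str) -> str:
    """Build a crude matching key for a family name (illustrative only)."""
    folded = fold_diacritics(name).lower()
    # Treat the transliterations 'ae' and 'oe' like the folded 'a' and 'o'.
    folded = folded.replace("ae", "a").replace("oe", "o")
    # Keep letters only, dropping hyphens, spaces and punctuation.
    return "".join(ch for ch in folded if ch.isalpha())


def group_variants(names):
    """Group spelling variants that share the same matching key."""
    groups = {}
    for name in names:
        groups.setdefault(name_key(name), []).append(name)
    return groups


variants = ["Hämäläinen", "Hamalainen", "Haemaelaeinen", "HÄMÄLÄINEN"]
print(group_variants(variants))
```

A key like this only suggests candidates for the same author; the grouped records still need to be checked by hand, for example against the author's own publication list.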
See the instructions on how to find the h-index in Web of Science (see: h-Index -> Web of Science).
Further information:
Clarivate. 2023. Citation Reports.
Finnish national guide to publication metrics: Data sources: Web of Science.
Web of Science Training. 2017. Video (2:59): Journal Citation Reports: Immediacy Index.
Web of Science Training. 2017. Video (5:21): Web of Science – Citation Sources.
Web of Science Training. 2017. Video (3:48): Web of Science – Usage Counts.
OpenAlex
OpenAlex is an open catalog of publications, authors, journals, institutions, concepts, and the connections between them. It was released in January 2022 to replace the discontinued Microsoft Academic Graph (MAG). OpenAlex is a tool built by the non-profit OurResearch and funded by Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin. In addition to OpenAlex, OurResearch has built other tools for open science, such as Unpaywall.
At the heart of OpenAlex is a dataset: a catalog of works. A work is any sort of scholarly output. A research article is one kind of work, but there are others, such as datasets, books, and dissertations. As of January 2024, OpenAlex covered almost 250M works (of which 52M were open access), with 1.9B citations and 90M authors. The two most important data sources are MAG and Crossref, but there are several other key sources, including:
- ORCID
- ROR
- DOAJ
- Unpaywall
- PubMed
- PubMed Central
- The ISSN International Centre
- Internet Archive
- Web crawls
- Subject-area and institutional repositories from arXiv to Zenodo and many in between
OpenAlex offers three different options for information retrieval:
- OpenAlex Web — Web user interface
- OpenAlex API — A REST API to get the data programmatically
- Data Snapshot — A periodic snapshot of the data, available to download in its entirety, for free
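As a rough illustration of the API option above, the following sketch builds a query URL for the OpenAlex `/works` endpoint and, optionally, fetches the results. It is a minimal example under stated assumptions: the helper names `build_works_url` and `fetch_works` are invented here, the search term and year are arbitrary, and rate limits or authentication requirements should be checked in the current OpenAlex documentation before use.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.openalex.org"


def build_works_url(search, year=None, per_page=5, mailto=None):
    """Build a /works query URL (hypothetical helper, not part of OpenAlex)."""
    params = {"search": search, "per-page": str(per_page)}
    if year is not None:
        # Restrict results to a single publication year.
        params["filter"] = f"publication_year:{year}"
    if mailto:
        # Optional contact address, sent as the 'mailto' query parameter.
        params["mailto"] = mailto
    return f"{API_ROOT}/works?" + urllib.parse.urlencode(params)


def fetch_works(url):
    """Fetch one page of results; this makes a real network request."""
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return [
        (w.get("display_name"), w.get("publication_year"), w.get("cited_by_count"))
        for w in payload.get("results", [])
    ]


url = build_works_url("bibliometrics", year=2023, per_page=3)
print(url)
# To query the live service, uncomment the next line:
# print(fetch_works(url))
```

The Web interface is usually enough for exploring a topic; the API or the data snapshot becomes useful when the same query has to be repeated or when the results are analysed further in a script.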
About the OpenAlex entities:
As mentioned earlier, OpenAlex has a catalog of works, i.e. scholarly outputs. OpenAlex keeps track of these works: their titles (and, in many cases, abstracts and full text), when they were created, and so on. In addition, it keeps track of the connections between these works, finding associations through things such as journals, authors, affiliations, citations, topics, and funders. Let’s have a look at the different entities.
- Works: Scholarly documents like journal articles, books, datasets, and theses.
- Authors: People who create works.
- Sources: Where works are hosted (such as journals, conferences, and repositories).
- Institutions: Universities and other organizations to which authors claim affiliations.
- Concepts: Topics assigned to works.
- Publishers: Companies and organizations that distribute works.
- Funders: Organizations that fund research.
- Geo: Where things are in the world.
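The way these entities are linked can be seen in a single work record. The sketch below reads a hand-written, invented record that follows the general shape of an OpenAlex work (authorships pointing to authors and institutions, a primary location pointing to a source) and pulls out the connected entities. The function name `summarise_work` and all sample values are assumptions for illustration; real records contain many more fields.

```python
def summarise_work(work):
    """Collect the entities connected to one work record (illustrative helper)."""
    authorships = work.get("authorships") or []
    authors = [a["author"]["display_name"] for a in authorships if a.get("author")]
    institutions = sorted(
        {
            inst["display_name"]
            for a in authorships
            for inst in (a.get("institutions") or [])
        }
    )
    # The source (e.g. a journal) sits under the work's primary location.
    location = work.get("primary_location") or {}
    source = (location.get("source") or {}).get("display_name")
    return {
        "title": work.get("display_name"),
        "year": work.get("publication_year"),
        "cited_by": work.get("cited_by_count"),
        "authors": authors,
        "institutions": institutions,
        "source": source,
    }


# Invented sample record, not real data.
SAMPLE_WORK = {
    "display_name": "An invented article title",
    "publication_year": 2023,
    "cited_by_count": 7,
    "authorships": [
        {
            "author": {"display_name": "A. Author"},
            "institutions": [{"display_name": "University X"}],
        },
        {
            "author": {"display_name": "B. Author"},
            "institutions": [
                {"display_name": "University X"},
                {"display_name": "Institute Y"},
            ],
        },
    ],
    "primary_location": {"source": {"display_name": "Journal of Examples"}},
}

print(summarise_work(SAMPLE_WORK))
```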
Watch the OpenAlex tutorial and check the OpenAlex help center for more information.
Further information:
OurResearch. 2024. OpenAlex help center.
Finnish national guide to publication metrics: Tools: OpenAlex.
Journal Citation Reports (JCR)
Journal Citation Reports (JCR) (click ‘Full text availability’/‘Kokotekstin saatavuus’ to access the database), produced by Clarivate Analytics, is the only official source for impact factor (IF) values. JCR contains impact factor values and other indicators for all journals included in the Web of Science database. Clarivate Analytics owns the impact factor values, and no other party is allowed to publish them, at least not under the same name. JCR can be used as part of the Web of Science database.
See the guidance on finding a journal’s impact factor in the JCR database (see: IF (Impact Factor) -> How to find an Impact Factor number).
Further information:
Clarivate Analytics. 2017. Journal Citation Reports: Learn the Basics.
Finnish national guide to publication metrics: Analysis tools: Journal Citation Reports.
Web of Science Training. 2017. Video (5:16): Journal Citation Reports – Journal Impact Factor.
Google Scholar (GS) citations / author profile
Google Scholar citations provide a simple way to keep track of citations to an author’s publications. Researchers can create a Google Scholar profile (requires a Google account, such as Gmail) and add to it all their publications found through the Google Scholar search. The My citations link (“omat lainaukset” in Finnish) shows the list of publications, the number of times each publication has been cited, the h-index and the i10-index (the number of publications with at least 10 citations). Author profile example.
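The two indicators shown in the profile are easy to compute from a list of citation counts. The following is a minimal sketch: the function names and the citation counts are invented for illustration, and the i10-index here follows Google Scholar's definition of publications with at least 10 citations.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of publications with at least 10 citations."""
    return sum(1 for cites in citations if cites >= 10)


# Invented citation counts for six publications.
counts = [25, 12, 10, 8, 3, 0]
print(h_index(counts))    # 4: four publications have at least 4 citations each
print(i10_index(counts))  # 3: three publications have at least 10 citations
```

Because both indicators depend entirely on the underlying citation counts, the same author will usually get different values from Google Scholar, Scopus and Web of Science.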
The profile can be public or private. When searching GS by an author, the public profile will be at the top of the results.
Compared to Web of Science and Scopus, Google Scholar covers more books, book chapters, theses, conference proceedings, technical reports and other publication types. It also covers a wider range of journals and more publications in languages other than English. Even so, the coverage of Google Scholar is uncontrolled, and there is no list of all the publications included. The data are obtained from academic publishers, professional societies, online repositories, universities and other websites. Google Scholar does not separate non-scientific material (such as seminar presentations, working papers or master’s and bachelor’s theses) from scientific, scholarly literature. Nor is there any information available about the time span covered.
Google Scholar is also easy to manipulate. Read: The Google Scholar Experiment: how to index false papers and manipulate bibliometric indicators / Emilio Delgado López-Cózar, Nicolás Robinson-García. (Also part of the assignment of the course module.)
Google Scholar generally provides a higher citation count than Web of Science or Scopus, especially in the fields of social sciences, arts and humanities. The natural and health sciences are better covered in those databases. Google Scholar as a source for citation information differs significantly from scientific citation databases. Google Scholar citations are not always from scholarly journals or books.
How to set up your Google Scholar profile (Slideshare)
Publish or Perish (PoP)
Publish or Perish is a free software program that uses Google Scholar citation data to calculate impact metrics, including the h-index. The main focus is on an author’s impact, but it is also possible to examine the impact of a journal or analyse an individual article.
The program was developed by Dr. Anne-Wil Harzing of the University of Melbourne. It must be downloaded from Harzing’s web pages (no need for administrator rights).
Because it uses Google Scholar data, the limitations are the same. Keep in mind the undefined contents of Google Scholar, and be critical about the quality and reliability of the citation information retrieved. Many citations come from non-scholarly publications. Self-citations are also included. The output of Publish or Perish should not be compared with the output of Web of Science or Scopus. As a source of scientific citation information, GS is inaccurate and undefined. Instead, GS citation data tell more about the societal impact of publications.
How to find an author’s citation data using Publish or Perish:
Publish or Perish mines all Google Scholar data and will present data on any authors – not only those with a public Google Scholar profile.
Cleaning up the results list:
The results may include publications by another author with the same name, or non-scholarly publications that you do not wish to include. Uncheck the boxes next to the records that you do not want included in the calculations.
Often there are also duplicate entries for the same publication. Merge the duplicates by dragging one record on top of the other. Be careful to choose the most accurate one as the main reference.
The results are displayed in order of the citation count. You can change the order by clicking any of the column headings. If you have a long list of publications, it is useful to sort the results by title or year.
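The clean-up steps above (merging duplicates and sorting by citation count) can also be sketched in code, for example when working with an exported list. This is not Publish or Perish's own algorithm: the `Record` class, the title-based matching key and the choice to sum the citation counts of merged entries are all assumptions made for this illustration, and the records are invented.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    year: int
    cites: int


def title_key(title):
    """Crude duplicate key: lower-cased title with only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_duplicates(records):
    """Merge records with the same title key and sort by citation count."""
    merged = {}
    for rec in records:
        key = title_key(rec.title)
        main = merged.get(key)
        if main is None:
            merged[key] = Record(rec.title, rec.year, rec.cites)
        else:
            # Keep the more-cited entry as the main reference (an assumption)
            # and add the citation counts together.
            best = rec if rec.cites > main.cites else main
            merged[key] = Record(best.title, best.year, main.cites + rec.cites)
    return sorted(merged.values(), key=lambda r: r.cites, reverse=True)


# Invented example records.
records = [
    Record("Open access citation advantage", 2019, 40),
    Record("Another paper", 2021, 12),
    Record("Open Access Citation Advantage.", 2019, 5),
]
for rec in merge_duplicates(records):
    print(rec.cites, rec.year, rec.title)
```

Automatic matching on titles can merge different works that happen to share a title, so the merged list should still be checked against the author's own publication list.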
Use the copy button to export your list to Excel or another program.
What if you have a very common author name? How do you separate your works from those of “confuser” authors? It often helps to have your publication list at hand. Sometimes the only way is to search for publications one by one and export them to Excel (copy button on the PoP main page). It may also be worth creating a Google Scholar profile of your own.
Further information:
Finnish national guide to publication metrics: Data sources: Google Scholar.
Finnish national guide to publication metrics: Analysis tools: Publish or Perish.
H-index in Publish or Perish / University of Turku
– Please, scroll down to see the instructions.
Publish or Perish 6 manual / Harzing.com
– Related topics also
Impact of social sciences blog / London School of Economics and Political Science (LSE)
– A good blog for following the discussion, especially for researchers in the social sciences.
Publication Forum (JUFO)
Publication Forum (i.e. Julkaisufoorumi, often referred to as JUFO in Finnish) is a rating and classification system of scientific publication channels created by the Finnish scientific community to support the quality assessment of academic research. To account for the different publication cultures characteristic of various disciplines, the classification includes academic journals, book series, conferences as well as book publishers.
In the publication channel search, you can search for journals, series, conferences and book publishers that have a Publication Forum (JUFO) rating. The three-level classification, evaluated by 23 discipline-specific expert panels, rates the major foreign and domestic publication channels of all disciplines as follows:
1 = basic level
2 = leading level
3 = highest level
- Other identified publication channels which have not received level 1 rating are marked with 0.
- If there is no marking, the publication channel in question is under evaluation and does not yet have a rating.
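The rating scheme above maps naturally onto a small lookup table. The sketch below is only an illustration of the levels as described here: the names `JUFO_LABELS`, `describe_level` and `level_distribution` are invented, the list of levels is made-up data, and nothing is assumed about the format in which the JUFO portal itself delivers ratings.

```python
from collections import Counter

# Labels for the levels described above; 0 marks identified channels
# that have not received a level 1 rating.
JUFO_LABELS = {
    0: "identified channel without a level 1 rating",
    1: "basic level",
    2: "leading level",
    3: "highest level",
}


def describe_level(level):
    """Return a label for a JUFO level; None means no marking yet."""
    if level is None:
        return "under evaluation, not yet rated"
    return JUFO_LABELS[level]


def level_distribution(levels):
    """Count how many publication channels fall under each label."""
    return dict(Counter(describe_level(level) for level in levels))


# Invented levels for the channels of eight publications.
levels = [1, 1, 2, 3, 0, 1, None, 2]
print(level_distribution(levels))
```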
The Publication Forum classification was created to respond to the need to evaluate the research output of universities and other institutes not only quantitatively but also qualitatively. Since 2015, the classification has been used as a quality indicator of the research output produced by universities within the university funding model established by the Ministry of Education and Culture.
The JUFO classification provides information on the impact of academic publication channels and the appreciation they enjoy within the scientific community. The objective of the Publication Forum is to encourage Finnish scholars and researchers to publish their research outcomes in high-level domestic and foreign forums. However, the level awarded to a publication channel mirrors the average level of its published articles, and thus it is rational to use the classification to evaluate large publication volumes only. The classification is not suited for the evaluation of the merits of an individual researcher, nor can it replace an assessment made by experts in a specific field e.g. in recruitment situations.
Practise using JUFO:
- Check the experts of the JUFO Panel of your discipline. Can you find any familiar panelists?
- Can you find level 3 journals of your discipline, where you could possibly publish?
- Can you find a potential journal for your article that offers open access publishing? What is its Publication Forum level and open access type?
Further reading:
Finnish national guide to publication metrics: Analysis tools: Publication Forum.
UEF CRIS
Scientific publications produced by higher education institutions, research institutions and university hospitals must be reported annually to the Ministry of Education and Culture. The Ministry collects publication data to build a knowledge base on the research activities and the social impact of the Finnish research system. The Ministry uses the publication data collected from higher education institutions to calculate the basic funding allocated to universities and universities of applied sciences, and also to otherwise monitor research and development activities.
Statistics on the number of publications are available in the Vipunen statistics service (see below) of the Finnish National Board of Education. The collected publication data can be viewed openly in the JUULI publication data portal (see below). In addition, the national information resource, the VIRTA publication data service, will make the publication information produced within the Finnish research system available to other services.
At UEF, researchers report their publications to UEF CRIS (more information in 4. module: Publishing and Research Visibility). In UEF CRIS, you may explore the UEF experts, their publications, projects and scientific activities.
Research.fi (Tiedejatutkimus.fi)
Research.fi (Tiedejatutkimus.fi) is a service offered by the Ministry of Education and Culture that collects and shares information on research conducted in Finland. The service makes it easier to find research information and experts, and it increases the visibility and societal impact of Finnish research. The service was launched in June 2020 and will be developed in phases.
University-specific publication data can be browsed in the research information systems of each university.
VIPUNEN
Statistical analyses of publication data can be obtained from the Vipunen reporting portal. It contains not only data on publications but also other information related to the operations of the research universities and universities of applied sciences in Finland.
Higher education and R&D activity -> Bibliometrics (only on Finnish web pages).
Further reading:
Finnish national guide to publication metrics: Data sources and analysis tools.
(8/2024 JN)