Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic

@article{Visser2020LargescaleCO,
  title={Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic},
  author={Martijn S. Visser and Nees Jan van Eck and Ludo Waltman},
  journal={Quantitative Science Studies},
  year={2020},
  pages={1--22}
}
We present a large-scale comparison of five multidisciplinary bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. The comparison considers scientific documents from the period 2008–2017 covered by these data sources. Scopus is compared in a pairwise manner with each of the other data sources. We first analyze differences between the data sources in the coverage of documents, focusing for instance on differences over time, differences per document… 
Comparative Analysis of the Bibliographic Data Sources Dimensions and Scopus: An Approach at the Country and Institutional Levels
TLDR
It is found that close to half of all documents in Dimensions are not associated with any country of affiliation, while the proportion of documents lacking this data in Scopus is much lower, which limits what Dimensions can offer as an instrument for bibliometric analyses at the country and institutional levels.
Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations
TLDR
This paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study.
Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today's Academic World
TLDR
This paper provides an all-inclusive description of the two main bibliographic databases, gathering in one place the findings presented in the most recent literature and the information provided by the database owners.
Open access at the national level: A comprehensive analysis of publications by Finnish researchers
TLDR
Institutional data, integrated at national and international level, provides one of the building blocks of a large-scale data infrastructure needed for comprehensive assessment and monitoring of OA across countries, for example at the European level.
Growth rates of modern science: a latent piecewise growth curve approach to model publication numbers from established and new literature databases
TLDR
This study analyzed scientific growth in two broad fields and the relationship between scientific and economic growth in the UK, and estimated regression models that simultaneously included the publication counts from the four databases to investigate scientific growth processes from the beginning of the modern science system until today.
Scaling Scientometrics: Dimensions on Google BigQuery as an Infrastructure for Large-Scale Analysis
TLDR
A novel visualisation technique is introduced and used as a means to explore the potential for scaling scientometrics by democratising both access to data and compute capacity using the cloud.
Which aspects of the Open Science agenda are most relevant to scientometric research and publishing? An opinion paper
TLDR
The aspects of Open Science that are most relevant for scientometricians are considered, discussing how they can be usefully applied.
A Glimpse of the First Eight Months of the COVID-19 Literature on Microsoft Academic Graph: Themes, Citation Contexts, and Uncertainties
  • Chaomei Chen
  • Computer Science
  • Frontiers in Research Metrics and Analytics
  • 2020
TLDR
A generic method is introduced that facilitates the data collection and sense-making process when dealing with a rapidly growing landscape of a research domain such as COVID-19 at multiple levels of granularity.
Do Online Readerships Offer Useful Assessment Tools? Discussion Around the Practical Applications of Mendeley Readership for Scholarly Assessment
This methods report illustrates the relevance of Mendeley readership as a tool for research assessment. Readership indicators offer new possibilities to inform the evaluation of publications and
COVID-19 enabled co-authoring networks: a country-case analysis
TLDR
Results suggest that the affiliated institutional sectors such as the Higher Education Sector (HES) and the Government Sector (GOV) contributed the most in terms of scientific output.

References

The Journal Coverage of Web of Science, Scopus and Dimensions: A Comparative Analysis
TLDR
The results indicate that the databases have significantly different journal coverage, with the Web of Science being most selective and Dimensions being the most exhaustive.
Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations
TLDR
This paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study.
Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison
TLDR
The main conclusion is that the data about highly-cited documents available in the inclusive database Google Scholar does indeed reveal significant coverage deficiencies in Web of Science and Scopus in several areas of research.
The journal coverage of Web of Science and Scopus: a comparative analysis
TLDR
Results indicate that the use of either WoS or Scopus for research evaluation may introduce biases that favor Natural Sciences and Engineering as well as Biomedical Research to the detriment of Social Sciences and Arts and Humanities.
Dimensions: re-discovering the ecosystem of scientific information
TLDR
It is concluded that Dimensions is an alternative for carrying out citation studies, able to rival both Scopus (greater coverage and free of charge) and Google Scholar (greater functionality for data processing and export).
Two new kids on the block: How do Crossref and Dimensions compare with Google Scholar, Microsoft Academic, Scopus and the Web of Science?
TLDR
Overall, this first small-scale study suggests that, when compared to Scopus and the Web of Science, Crossref and Dimensions have a similar or better coverage for both publications and citations, but a substantively lower coverage than Google Scholar and Microsoft Academic.
Comparing bibliometric country-by-country rankings derived from the Web of Science and Scopus: the effect of poorly cited journals in oncology
TLDR
It is found that the oncological journals in Scopus not covered by WoS tend to be nationally oriented journals, i.e. they mainly serve a national research community and as yet play a more peripheral role in the international journal communication system.
Scopus as a curated, high-quality bibliometric data source for academic research in quantitative science studies
TLDR
The trustworthiness of Scopus has led to its use as a bibliometric data source for large-scale analyses in research assessments, research landscape studies, science policy evaluations, and university rankings.
Comparison of bibliographic data sources: Implications for the robustness of university rankings
TLDR
Detailed bibliographic comparisons between three key databases are performed. The authors suggest that robust evaluation measures need to consider the effect of the choice of data sources, and they recommend an approach in which data from multiple sources is integrated to provide a more robust dataset.