Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases

@article{Gusenbauer2018GoogleST,
  title={Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases},
  author={Michael Gusenbauer},
  journal={Scientometrics},
  year={2018},
  volume={118},
  pages={177--214}
}
Information on the size of academic search engines and bibliographic databases (ASEBDs) is often outdated or entirely unavailable. Hence, it is difficult to assess the scope of specific databases, such as Google Scholar. While scientometric studies have estimated ASEBD sizes before, the methods employed were able to compare only a few databases. Consequently, there is no up-to-date comparative information on the sizes of popular ASEBDs. This study aims to fill this blind spot by providing a…
SEforRA: A Bibliometrics-ready Academic Digital Library Search Engine Alternative
TLDR
SEforRA extracts and processes data from CrossRef, publishers, and other sources to provide an integrated platform where researchers can search and retrieve publication metadata that is ready for further use in their research.
Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today's Academic World
TLDR
An all-inclusive description of the two main bibliographic DBs is provided, gathering in one place the findings presented in the most recent literature and the information provided by the owners of the DBs.
Irreproducibility in searches of scientific literature: A comparative analysis
TLDR
A comparative analysis of time‐synchronized searches at different institutional locations in the world is presented; it reveals large variation among search platforms and shows that PubMed and Scopus returned consistent results to identical search strings from different locations.
ResearchGate and Google Scholar: How much do they differ in publications, citations and different metrics and why?
TLDR
There are significant differences in publication counts and citations for the same authors on the two platforms, with Google Scholar having higher counts in the vast majority of cases.
Universities through the eyes of bibliographic databases: a retroactive growth comparison of Google Scholar, Scopus and Web of Science
TLDR
This work proves that the URL-based method to calculate institutional productivity in GS is not a good proxy for the total number of publications indexed in WoS and Scopus, at least in the national context analyzed.
Comprehensiveness and uniqueness of commercial databases and open access systems
TLDR
The study reveals that search engines tend to provide more resources than commercial databases do, but also that commercial databases have better coverage than institutional repositories.
In-text citation’s frequencies-based recommendations of relevant research papers
TLDR
The evaluation results indicate that in-text citation frequency has attained higher precision in finding relevant papers than other state-of-the-art techniques such as content, bibliographic coupling, and metadata-based techniques.
Which academic search systems are suitable for systematic reviews or meta‐analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources
TLDR
The study is the first to show the extent to which search systems can effectively and efficiently perform (Boolean) searches with regard to precision, recall, and reproducibility, and to demonstrate why Google Scholar is inappropriate as a principal search system.
Comparison of bibliographic data sources: Implications for the robustness of university rankings
TLDR
Detailed bibliographic comparisons between three key databases are performed; the results suggest that robust evaluation measures need to consider the effect of the choice of data sources, and an approach is recommended in which data from multiple sources are integrated to provide a more robust dataset.
NLP Scholar: A Dataset for Examining the State of NLP Research
TLDR
The NLP Scholar Dataset is presented – a single unified source of information (from both AA and Google Scholar) for tens of thousands of NLP papers that can be used to identify broad trends in productivity, focus, and impact of NLP research.

References

SHOWING 1-10 OF 75 REFERENCES
Is Google Scholar useful for bibliometrics? A webometric analysis
TLDR
A novel approach is introduced to check the usefulness of this database for bibliometric analysis, and especially research evaluation: instead of names of authors or institutions, a webometric analysis of academic web domains is performed.
Methods for estimating the size of Google Scholar
TLDR
Three empirical methods are presented, applied and discussed: an external estimate based on empirical studies of Google Scholar coverage, and two internal estimate methods based on direct, empty and absurd queries, respectively, which place the estimated size of Google Scholar at around 160–165 million documents.
Google Scholar as a source for scholarly evaluation: A bibliographic review of database errors
TLDR
The results indicate that the bibliographic corpus dedicated to errors in Google Scholar is still very limited, excessively fragmented, and diffuse; the findings have not been based on any systematic methodology or on units that are comparable to each other, so they cannot be quantified, or their impact analysed, with any precision.
Google Scholar as a data source for research assessment
TLDR
It is concluded that Google Scholar presents a broader view of the academic world because it has brought to light a great number of sources that were not previously visible.
Is Google enough? Comparison of an internet search engine with academic library resources
TLDR
A novel form of relevance assessment, based on the work of Saracevic and others, was devised in order to assess the relative value, strengths and weaknesses of the two sorts of system.
A Technique for Measuring the Relative Size and Overlap of Public Web Search Engines
TLDR
A standardized, statistical way of measuring search engine coverage and overlap through random queries is described; it can be implemented by third-party evaluators using only public query interfaces, and it suggests the size of the static, public Web as of November was over 200 million pages.
Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation - Review of the Literature
TLDR
The results show that GS has significantly expanded its coverage through the years, which makes it a powerful database of scholarly literature; however, the quality of the resources indexed and the overall policy still remain unknown.
An exploratory study of Google Scholar
TLDR
The study shows deficiencies in the coverage and up‐to‐dateness of the GS index and points out which web servers are the most important data providers for this search service and which information sources are highly represented.
A Comparison between Two Main Academic Literature Collections: Web of Science and Scopus Databases
Nowadays, the world’s scientific community has been publishing an enormous number of papers in different scientific fields. In such an environment, it is essential to know which databases are equally…
Can we use Google Scholar to identify highly-cited documents?
TLDR
Evidence is found that Google Scholar ranks those documents whose language (or geographical web domain) matches the user’s interface language higher than could be expected based on citations; however, this language effect and other factors related to the Google Scholar operation have only an incidental impact.