Does Google Scholar contain all highly cited documents (1950-2013)?

Abstract

Highly cited documents in Google Scholar (GS) have not yet been studied in a comprehensive manner. The objective of this work is to identify the set of highly cited documents in Google Scholar and to define their core characteristics: their languages, their file formats, and how many of them can be accessed free of charge. We also try to answer some additional questions that may shed light on the use of GS as a tool for assessing scientific impact through citations. The decalogue of research questions is shown below:

1. Which are the most cited documents in GS?
2. Which are the most cited document types in GS?
3. In what languages are the most cited documents in GS written?
4. How many highly cited documents are freely accessible?
   a. What file types are most commonly used to store these highly cited documents?
   b. Which are the main providers of these documents?
5. How many of the highly cited documents indexed by GS are also indexed by Web of Science (WoS)?
6. Is there a correlation between the number of citations that these highly cited documents have received in GS and the number of citations they have received in WoS?
7. How many versions of these highly cited documents has GS detected?
8. Is there a correlation between the number of versions GS has detected for these documents and the number of citations they have received?
9. Is there a correlation between the number of versions GS has detected for these documents and their position in the search engine result pages?
10. Is there a relation between the positions these documents occupy in the search engine result pages and the number of citations they have received?

To answer these questions, a set of 64,000 documents indexed in Google Scholar was collected by running 64 queries in Google Scholar's advanced search, one per year from 1950 to 2013, and collecting the maximum number of records that GS displays for any given query, which is always 1,000.
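Several of the questions above (6, 8, 9 and 10) ask whether two per-document quantities are correlated. The abstract does not say which coefficient is used, so the following is only a minimal sketch that assumes a Spearman rank correlation, a common choice for skewed citation counts. The function names and the five sample values are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical citation counts for five documents (illustration only).
gs_citations = [5200, 3100, 2900, 1800, 1500]
wos_citations = [2100, 900, 1200, 700, 400]

rho = spearman(gs_citations, wos_citations)
print(f"Spearman rho (toy data): {rho:.3f}")
```

For this toy sample only two documents swap rank between the two sources, so the coefficient works out to 0.9. The same function could be applied to versions versus citations, or to result-page position versus citations, as long as each pair of lists is aligned by document.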
These 64,000 documents have received 122,245,865 citations in Google Scholar and 35,182,077 in the Web of Science Core Collection. The full raw data are available at: http://dx.doi.org/10.6084/m9.figshare.1224314
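The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative. Note one assumption: the per-document means divide by all 64,000 documents, although the abstract does not say how many of them WoS actually indexes, so the WoS mean is a lower bound rather than a mean over WoS-indexed documents.

```python
# Figures taken directly from the abstract.
FIRST_YEAR, LAST_YEAR = 1950, 2013
MAX_RECORDS_PER_QUERY = 1_000
GS_CITATIONS = 122_245_865
WOS_CITATIONS = 35_182_077

# One query per year, both endpoints included.
n_queries = LAST_YEAR - FIRST_YEAR + 1
n_documents = n_queries * MAX_RECORDS_PER_QUERY

# Assumption: means are taken over the whole sample, not only the
# subset of documents that WoS indexes (that subset size is not given here).
gs_mean = GS_CITATIONS / n_documents
wos_mean = WOS_CITATIONS / n_documents
gs_to_wos = GS_CITATIONS / WOS_CITATIONS

print(f"queries: {n_queries}, documents: {n_documents}")
print(f"mean GS citations per document:  {gs_mean:.2f}")
print(f"mean WoS citations per document: {wos_mean:.2f}")
print(f"GS / WoS citation ratio:         {gs_to_wos:.2f}")
```

Under these assumptions the sample averages roughly 1,910 GS citations per document, and GS records about 3.5 times as many citations as WoS for the same set.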

14 Figures and Tables

Cite this paper

@article{MartnMartn2014DoesGS, title={Does Google Scholar contain all highly cited documents (1950-2013)?}, author={Alberto Mart{\'i}n-Mart{\'i}n and Enrique Ordu{\~n}a-Malea and Juan Manuel Ayllon and Emilio Delgado L{\'o}pez-C{\'o}zar}, journal={CoRR}, year={2014}, volume={abs/1410.8464} }