Corpus ID: 216056626

A Deeper Investigation of the Importance of Wikipedia Links to the Success of Search Engines

Nicholas Vincent and Brent J. Hecht
A growing body of work has highlighted the important role that Wikipedia's volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs). Our results extend prior work by considering three U.S. search engines, simulating both mobile and desktop devices, and using a spatial analysis… 

Wikipedia can help resolve information inequality in the aquatic sciences
Dustin W. Kincaid, Whitney S. Beck, Jessica E. Brandt, Margaret Mars Brisbin, Kaitlin J. Farrell, Kelly L. Hondula, Erin I. Larson, Arial J. Shogren. Vermont EPSCoR, University of Vermont
Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies
Drawing on prior work in areas including machine learning, human-computer interaction, and fairness and accountability in computing, a framework for understanding data leverage is presented that highlights new opportunities to change technology company behavior related to privacy, economic inequality, content moderation and other areas of societal concern.


Measuring the Importance of User-Generated Content to Search Engines
A rigorous audit of the extent to which Google leverages Wikipedia and other user-generated content to respond to queries shows that Wikipedia appears in over 80% of results pages for some query types and is by far the most prevalent individual content source across all query types.
The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies
Evidence is found that Google’s critical role in providing readership to Wikipedia is in jeopardy and researchers and practitioners should give deeper consideration to the interdependence between peer production communities and the information technologies that use and surface their content.
Investigating the Effects of Google's Search Engine Result Page in Evaluating the Credibility of Online News Sources
A study in which participants were instructed to perform lateral reading for credibility assessment by inspecting Google's search engine result page (SERP) for unfamiliar news sources indicates that there are widespread inconsistencies in the coverage and quality of information included in Knowledge Panels.
How the Interplay of Google and Wikipedia Affects Perceptions of Online News Sources
Two user studies are presented in which participants were asked to make assumptions about the credibility of a news source based only on its Google SERP, and it is suggested that the presence of Knowledge Panel features is perceived to be important to participants’ credibility determinations.
Enhancing web search in the medical domain via query clarification
The utility of bridging the gap between layperson and expert vocabularies is investigated and the approach adds the most appropriate expert expression to queries submitted by users, a task the authors call query clarification.
Auditing the Personalization and Composition of Politically-Related Search Engine Results Pages
A targeted algorithm audit of Google Search is conducted using a dynamic set of political queries to find significant differences in the composition and personalization of politically-related SERPs by query type, subjects’ characteristics, and date.
Measuring personalization of web search
A methodology for measuring personalization in Web search results is developed and it is found that, on average, 11.7% of results show differences due to personalization, but that this varies widely by search query and by result ranking.
Determining the user intent of web search engine queries
This paper qualitatively analyzes samples of queries from seven transaction logs from three different Web search engines containing more than five million queries and identifies characteristics of user queries based on three broad classifications of user intent.
Location, Location, Location: The Impact of Geolocation on Web Search Personalization
This paper proposes a novel methodology to explore the impact of location-based personalization on Google Search results, and observes that differences in search results due to personalization grow as physical distance increases.
Beyond ten blue links: enabling user click modeling in federated web search
The proposed novel federated click model (FCM) can outperform other click models in interpreting user click behavior in federated search and achieve significant improvements in terms of both perplexity and log-likelihood.