Bias in Wikipedia

  • Christoph Hube
  • Published 3 April 2017
  • Computer Science
  • Proceedings of the 26th International Conference on World Wide Web Companion
While studies have shown that Wikipedia articles exhibit quality that is comparable to conventional encyclopedias, research still shows that Wikipedia, overall, is prone to many different types of Neutral Point of View (NPOV) violations that are explicitly or implicitly caused by bias from its editors. Related work focuses on political, cultural and gender bias. We are developing an approach for detecting both explicit and implicit bias in Wikipedia articles and observing its evolution over…
Wikipedia citations: A comprehensive data set of citations with identifiers extracted from English Wikipedia
Wikipedia Citations, a comprehensive data set of citations extracted from Wikipedia, finds that 6.7% of Wikipedia articles cite at least one journal article with an associated DOI, and that Wikipedia cites just 2% of all articles with a DOI currently indexed in the Web of Science.
Approaches for Enriching and Improving Textual Knowledge Bases
This thesis addresses the aforementioned issues and proposes automated approaches that enforce the verifiability principle in Wikipedia, and suggests relevant and missing news references for further enriching Wikipedia entity pages.
BelElect: A New Dataset for Bias Research from a "Dark" Platform
New social networks and platforms such as Telegram, Gab and Parler offer a stage for extremist, racist and aggressive content, but also provide a safe space for freedom fighters in authoritarian
Reverting Hegemonic Ideology: Research Librarians and Information Professionals as “Critical Editors” of Wikipedia
While many LIS publications have focused on Wikipedia, no LIS study has used intersectional class analysis to consider the site as a transmitter and reproducer of hegemonic ideology. Using both
What’s in the Box? An Analysis of Undesirable Content in the Common Crawl Corpus
It is found that the Common Crawl, a colossal web corpus that is extensively used for training language models, contains a significant amount of undesirable content, including hate speech and sexually explicit content, even after filtering procedures.
Early onset of structural inequality in the formation of collaborative knowledge in all Wikimedia projects
An analysis of all Wikimedia projects shows that a small number of editors have a disproportionately large influence in the formation of collective knowledge and develops an agent-based model that considers the characteristics of the editors and successfully reproduces the empirical results.
The Gendered Geography of Contributions to OpenStreetMap: Complexities in Self-Focus Bias
Findings that women are dramatically under-represented as OSM contributors are replicated, and it is found that men and women contribute different types of content and do so about different places, but the character of these differences confounds simple narratives about self-focus bias.
Uneven Coverage of Natural Disasters in Wikipedia: the Case of Flood
It is shown how the coverage of floods in Wikipedia is skewed towards rich, English-speaking countries, in particular the US and Canada, and this has implications for systems using Wikipedia or similar collaborative media platforms as an information source for detecting emergencies or for gathering valuable information for disaster response.
Analysis of category co-occurrence in Wikipedia networks
The m-core, a cohesive subgroup concept as a clustering model, is used to construct a subgraph depending on the number of shared pages between the categories exceeding a given threshold t, and the clustering in the category graph is shown to be consistent with the distance between categories in the taxonomy graph.
The Gender Bias Tug-of-War in a Co-creation Community: Core-Periphery Tension on Wikipedia
It is suggested that a balance of activity from central and peripheral contributors results in content with the most neutral point of view, and evidence of bias that advantages women and disadvantages men is found.


Cultural bias in Wikipedia content on famous persons
The extent to which content and perspectives vary across cultures is examined by comparing articles about famous persons in the Polish and English editions of Wikipedia, revealing systematic differences related to the different cultures, histories, and values of Poland and the United States.
First Women, Second Sex: Gender Bias in Wikipedia
This paper analyzes biographical content in Wikipedia in terms of how women and men are characterized in their biographies in three aspects: meta-data, language, and network structure to show that there are differences in characterization and structure.
Gender Bias in Wikipedia and Britannica
Is there a bias against women’s representation in Wikipedia biographies? Thousands of biographical subjects, from six sources, are compared against the English-language Wikipedia and the
Who likes me more?: analysing entity-centric language-specific bias in multilingual Wikipedia
A methodology using sentiment analysis techniques to systematically extract the variations in sentiments associated with real-world entities in different language editions of Wikipedia, illustrated with a case study of five Wikipedia language editions and a set of target entities from four categories.
Social media news communities: gatekeeping, coverage, and statement bias
The results, obtained by analyzing 80 international news sources during a two-week period, show that biases are subtle but observable, and follow geographical boundaries more closely than political ones.
Collective Intelligence and Neutral Point of View: The Case of Wikipedia
We examine whether collective intelligence helps achieve a neutral point of view using data from a decade of Wikipedia's articles on US politics. Our null hypothesis builds on Linus' Law, often
It's a Man's Wikipedia? Assessing Gender Inequality in an Online Encyclopedia
This paper presents and applies a computational method for assessing gender bias on Wikipedia along multiple dimensions and finds that while women on Wikipedia are covered and featured well in many Wikipedia language editions, the way women are portrayed starkly differs from the way men are portrayed.
Media Bias in German Online Newspapers
This paper investigates a dataset that covers all political and economic news from four leading German online newspapers over a timespan of four years and proposes a variety of automatically computable measures that can indicate media bias.
Finding News Citations for Wikipedia
This work proposes a two-stage supervised approach to finding and updating news citations for statements in entity pages, and develops a news citation algorithm for Wikipedia statements, which recommends appropriate citations from a given news collection.
Is Wikipedia Really Neutral? A Sentiment Perspective Study of War-related Wikipedia Articles since 1945
This paper tackles the challenge of finding sentiment differences in how Wikipedia articles in different languages describe the same war by specifically analysing a typically controversial topic, such as war, and proposing an automatic methodology based on article level and concept level sentiment analysis on multilingual Wikipedia articles.