Tell me more: an actionable quality model for Wikipedia

@inproceedings{warncke-wang2013tellmemore,
  title={Tell me more: an actionable quality model for Wikipedia},
  author={Morten Warncke-Wang and Dan Cosley and John Riedl},
  booktitle={Proceedings of the 9th International Symposium on Open Collaboration},
  year={2013}
}
In this paper we address the problem of developing actionable quality models for Wikipedia, models whose features directly suggest strategies for improving the quality of a given article. We first survey the literature in order to understand the notion of article quality in the context of Wikipedia and existing approaches to automatically assess article quality. We then develop classification models with varying combinations of more or less actionable features, and find that a model that only… 
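The "actionable features" idea above can be sketched in a few lines: a classifier over concrete article properties an editor can directly change, so that a weak feature value doubles as an improvement suggestion. This is a minimal illustration, not the paper's model; the feature names, thresholds, and the tiny labelled sample are all invented for the example.

```python
# Minimal sketch of an "actionable" quality classifier: every feature is
# a property an editor can directly change. Features, labels, and the
# training sample below are illustrative, not the paper's.
import math

FEATURES = ["word_count", "num_headings", "num_references", "num_images"]

# (feature dict, assessment class) pairs on Wikipedia's quality scale.
TRAIN = [
    ({"word_count": 150, "num_headings": 1, "num_references": 0, "num_images": 0}, "Stub"),
    ({"word_count": 900, "num_headings": 4, "num_references": 5, "num_images": 1}, "C"),
    ({"word_count": 4200, "num_headings": 12, "num_references": 35, "num_images": 4}, "FA"),
]

def predict(article):
    """Classify by 1-nearest neighbour over crudely scaled features."""
    def dist(a, b):
        return math.sqrt(sum(((a[f] - b[f]) / 100.0) ** 2 for f in FEATURES))
    return min(TRAIN, key=lambda pair: dist(article, pair[0]))[1]

def suggest(article):
    """The actionable part: weak feature values map to edit strategies."""
    tips = []
    if article["num_references"] < 5:
        tips.append("add citations")
    if article["num_headings"] < 3:
        tips.append("add section structure")
    return tips

draft = {"word_count": 300, "num_headings": 1, "num_references": 1, "num_images": 0}
print(predict(draft), suggest(draft))
```

Because each feature is something an editor controls, the same feature vector that drives the prediction can be read back as a to-do list, which is the sense in which such a model is "actionable".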


A Hybrid Model for Quality Assessment of Wikipedia Articles

A hybrid approach is proposed that combines deep learning with features from the literature for document quality assessment; on a novel dataset of around 60k English Wikipedia articles, it predicts quality classes with 6.5% higher accuracy than the state of the art.

An Edit-centric Approach for Wikipedia Article Quality Assessment

An edit-centric approach is proposed to assess Wikipedia article quality as a complementary alternative to current full-document-based techniques, and its cost-effectiveness in terms of data and quality requirements is demonstrated.

Interpolating Quality Dynamics in Wikipedia and Demonstrating the Keilana Effect

A method for measuring article quality in Wikipedia historically and at a finer granularity than was previously possible is described and offered to the research community studying Wikipedia quality dynamics.

Measuring Quality of Collaboratively Edited Documents: The Case of Wikipedia

  • Quang-Vinh Dang, C. Ignat
  • 2016 IEEE 2nd International Conference on Collaboration and Internet Computing (CIC), 2016
An automatic method for assessing the quality of Wikipedia articles is presented that analyzes their content in terms of format features and readability scores; results show improvements in both accuracy and information gain compared with other existing approaches.

Structural Analysis of Wikigraph to Investigate Quality Grades of Wikipedia Articles

This paper presents a novel approach based on structural analysis of the Wikigraph to automate the estimation of the quality of Wikipedia articles, and shows that these structural signatures estimate the quality grades of unassessed articles with accuracy surpassing existing approaches.

Automatically Labeling Low Quality Content on Wikipedia By Leveraging Patterns in Editing Behaviors

This work proposes an automated labeling approach that identifies the semantic category of historic Wikipedia edits and uses the modified sentences prior to the edit as examples that require that semantic improvement.

Growing Wikipedia Across Languages via Recommendation

This paper presents an end-to-end system for recommending articles for creation that exist in one language but are missing in another, and finds that personalizing recommendations doubles editor engagement and that articles created as a result of these recommendations are of comparable quality to organically created articles.

Understanding the 'Quality Motion' of Wikipedia Articles Through Semantic Convergence Analysis

The quantity of content change in promoted articles is found to be significant, which is consistent with Wikipedia's stated criteria.

Assessing the Quality of Wikipedia Articles

The evaluation on a real-world dataset shows that the latest approach to determining the quality of Wikipedia articles outperforms other recently proposed baseline methods.

Relative Quality Assessment of Wikipedia Articles in Different Languages Using Synthetic Measure

This paper proposes to use a synthetic measure for automatic quality evaluation of the articles in different languages based on important features in Wikipedia to help decide which language version is more complete and correct.



Identifying featured articles in wikipedia: writing style matters

A machine learning approach is presented that exploits an article's character trigram distribution, targeting writing style rather than meta features such as the edit history; the approach is robust, straightforward to implement, and outperforms existing solutions.
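The character-trigram representation behind this style-based approach is simple to sketch: map a text to its normalised trigram frequency vector, which a classifier could then use to separate featured from other articles. The example text below is illustrative; the actual paper's preprocessing and classifier are not reproduced here.

```python
# Sketch of character-trigram style features: a text becomes a
# normalised frequency distribution over its character trigrams.
from collections import Counter

def trigram_distribution(text):
    """Return the normalised character-trigram frequencies of a text."""
    trigrams = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(trigrams.values())
    return {t: n / total for t, n in trigrams.items()}

dist = trigram_distribution("the quality of the article")
# Common trigrams such as "the" receive proportionally higher weight.
print(dist["the"])
```

Because trigram distributions are cheap to compute and need no edit history, they fit the summary's claim of a straightforward, metadata-free style signal.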

On measuring the quality of Wikipedia articles

The experiment shows that using special-purpose models for information quality captures user sentiment about Wikipedia articles better than using a single model for both categories of articles.


Analysis of the discussion pages and other process-oriented pages within the Wikipedia project helps in understanding how high quality is maintained in a project where anyone may participate with no prior vetting.

Does it matter who contributes: a study on featured articles in the german wikipedia

This study explores, on the German Wikipedia, whether the mere number of contributors makes the difference or whether the high quality of featured articles results from experienced authors with a reputation for high-quality contributions.

Measuring article quality in wikipedia: models and evaluation

This paper proposes three article quality measurement models that make use of interaction data between articles and their contributors, derived from the article edit history, and a further model that combines the partial reviewership of contributors as they edit various portions of the articles.

Assessing Information Quality of a Community-Based Encyclopedia

This work proposes seven IQ metrics that can be evaluated automatically and tests them on a representative sample of Wikipedia content, along with a number of statistical characterizations of Wikipedia articles, their content construction, process metadata, and social context.

Size matters: word count as a measure of quality on wikipedia

A simple metric -- word count -- is proposed for measuring article quality and it is shown that this metric significantly outperforms the more complex methods described in related work.

Automatic quality assessment of content created collaboratively by web communities: a case study of wikipedia

This work explores a significant number of quality indicators, some of them proposed here and used for the first time, studies their capability to assess the quality of Wikipedia articles, and explores machine learning techniques to combine these indicators into a single assessment judgment.

Predicting quality flaws in user-generated content: the case of wikipedia

A quality flaw model is developed and a dedicated machine learning approach is employed to predict Wikipedia's most important quality flaws. Arguing that common binary or multiclass classification approaches are ineffective for predicting quality flaws, the authors cast quality flaw prediction as a one-class classification problem.

Quality Evaluation of Wikipedia Articles through Edit History and Editor Groups

This paper investigates the edit history of Wikipedia articles, proposes a model of the network structure of editors together with an algorithm to calculate the network-structural indicator restoreratio, and shows that the proposed indicator outperforms several existing metrics in quality evaluation.