Automatic quality assessment of content created collaboratively by web communities: a case study of Wikipedia

@inproceedings{Dalip2009AutomaticQA,
  title={Automatic quality assessment of content created collaboratively by web communities: a case study of Wikipedia},
  author={Daniel Hasan Dalip and Marcos Andr{\'e} Gon{\c{c}}alves and Marco Cristo and P{\'a}vel Pereira Calado},
  booktitle={JCDL '09},
  year={2009}
}
The old dream of a universal repository containing all human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct, collaborative participation of people. Wikipedia is a great example: an enormous repository of information with free access and editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises…
Automatic Assessment of Document Quality in Web Collaborative Digital Libraries
TLDR
This work explores a significant number of quality indicators, studies their ability to assess the quality of articles from three Web collaborative digital libraries, and applies machine learning techniques to combine these indicators into a single assessment.
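As a rough illustration of the indicator-combination idea summarized above, the sketch below (Python with scikit-learn) fits a support vector regressor on a few simple textual indicators. The indicators and the numeric quality scale are hypothetical placeholders, not the actual feature set or labels used in these papers.

```python
# Minimal sketch: combine simple textual quality indicators with a learned
# regressor. The three indicators and the 0-5 quality scale are illustrative
# placeholders, not the actual feature set or labels from these papers.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def indicators(text: str) -> list[float]:
    """Compute a few crude textual quality indicators for one article."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    return [
        float(len(words)),                                     # article length
        len(words) / max(len(sentences), 1),                   # average sentence length
        len({w.lower() for w in words}) / max(len(words), 1),  # lexical diversity
    ]

# X: indicator vectors; y: human quality ratings (e.g., 0 = stub ... 5 = featured).
articles = ["Short stub text.", "A longer, better developed article body with more detail."]
X = np.array([indicators(a) for a in articles])
y = np.array([0.0, 4.0])

model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
model.fit(X, y)
print(model.predict(X))   # predicted quality scores for the training articles
```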
A Multi-view Approach for the Quality Assessment of Wiki Articles
TLDR
This work proposes grouping the indicators into semantically meaningful views of quality and investigates a new approach to combining these views based on a meta-learning method known as stacking, demonstrating that the approach can be applied in collaborative encyclopedias such as Wikipedia and Wikia.
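A minimal sketch of view-based stacking, assuming three hypothetical feature views (text, structure, edit history) and synthetic data: one base classifier is trained per view, and a meta-learner combines their out-of-fold predictions.

```python
# Sketch of stacking over feature "views": one base learner per view, with
# out-of-fold predictions combined by a meta-learner. The views and the data
# are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 200
views = {                              # each view is a separate block of indicators
    "text":      rng.normal(size=(n, 5)),
    "structure": rng.normal(size=(n, 4)),
    "history":   rng.normal(size=(n, 3)),
}
y = rng.integers(0, 2, size=n)         # e.g., 1 = high-quality article

# Level 0: one classifier per view; out-of-fold probabilities avoid leaking
# fitted-on-itself outputs to the meta-learner.
meta_features = []
for name, X_view in views.items():
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    oof = cross_val_predict(clf, X_view, y, cv=5, method="predict_proba")[:, 1]
    meta_features.append(oof)

# Level 1: the stacking meta-learner combines the per-view predictions.
Z = np.column_stack(meta_features)
meta = LogisticRegression().fit(Z, y)
print(dict(zip(views, meta.coef_[0])))   # weight the meta-learner gives each view
```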
Measuring Quality of Collaboratively Edited Documents: The Case of Wikipedia
  • Quang-Vinh Dang, C. Ignat
  • Computer Science
  • 2016 IEEE 2nd International Conference on Collaboration and Internet Computing (CIC)
  • 2016
TLDR
An automatic method for assessing the quality of Wikipedia articles is presented, analyzing article content in terms of format features and readability scores; results show improvements in both accuracy and information gain compared with other existing approaches.
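The Flesch Reading Ease score is one common readability measure of the kind referred to above; a rough, self-contained implementation (with a crude vowel-group syllable heuristic, so the scores are only approximate) might look like this.

```python
# Rough Flesch Reading Ease computation with a crude vowel-group syllable
# heuristic. Real readability toolkits use dictionaries or better
# syllabification, so treat these scores as approximate.
import re

def count_syllables(word: str) -> int:
    word = word.lower()
    syllables = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and syllables > 1:
        syllables -= 1                      # drop a typical silent final 'e'
    return max(syllables, 1)

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n_sent, n_words = max(len(sentences), 1), max(len(words), 1)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (syllables / n_words)

print(flesch_reading_ease("The cat sat on the mat. It was a sunny day."))
```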
Automatically assessing the quality of Wikipedia contents
TLDR
The problem of automatically evaluating the quality of Wikipedia content is considered, proposing a supervised approach based on machine learning to classify articles on a qualitative basis.
WikiLyzer: Interactive Information Quality Assessment in Wikipedia
Digital libraries and services enable users to access large amounts of data on demand. Yet, quality assessment of information encountered on the Internet remains an elusive open issue. For example,…
Feature Analysis for Assessing the Quality of Wikipedia Articles through Supervised Classification
TLDR
The focus is on the analysis of hand-crafted features that can be employed by supervised machine learning techniques to classify Wikipedia articles on a qualitative basis, and a wider set of characteristics connected to Wikipedia articles is taken into account.
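As an illustration of the hand-crafted-feature setting, the toy sketch below trains a random forest on a handful of made-up feature vectors and inspects the resulting feature importances. The feature columns and the tiny labeled sample are invented for the example; only the Wikipedia-style quality labels (e.g., FA, Start, Stub) reflect real conventions.

```python
# Sketch: supervised classification of articles into quality classes from
# hand-crafted features, then a look at feature importances. The feature
# columns and the tiny labeled sample are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["words", "sections", "references", "images", "links"]
X = np.array([
    [120,   2,   0,  0,   5],      # a stub-like article
    [4500, 12,  35,  4,  80],      # a well-developed article
    [900,   5,   3,  1,  20],
    [7800, 18,  90, 10, 150],
])
y = np.array(["Stub", "FA", "Start", "FA"])   # Wikipedia-style quality labels

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, y)
print(dict(zip(feature_names, clf.feature_importances_.round(3))))
```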
On MultiView-Based Meta-learning for Automatic Quality Assessment of Wiki Articles
TLDR
This work investigates the use of meta-learning techniques to combine sets of semantically related quality indicators in order to automatically assess the quality of wiki articles, inspired by the combination of multiple (quality) experts.
Quality Assessment of Peer-Produced Content in Knowledge Repositories Using Big Data and Social Networks: The Case of Implicit Collaboration in Wikipedia
TLDR
This research introduces and defines the concept of implicit collaboration and then identifies two dimensions and four possible areas of collaboration, which have implications for developing automated quality assessment methods for peer-produced content using big data and social networks.
Interactive Quality Analytics of User-generated Content
TLDR
The contribution is an interactive tool that combines automatic classification methods and human interaction in a toolkit, whereby experts can experiment with new quality metrics and share them with authors who need to identify weaknesses to improve a particular article.
The quality of content in open online collaboration platforms: approaches to NLP-supported information quality management in Wikipedia
TLDR
A comprehensive article quality model is defined that aims to consolidate both the quality of writing and the quality criteria defined in multiple Wikipedia guidelines and policies into a single model, and an approach for automatically identifying quality flaws in Wikipedia articles is presented.

References

Showing 1-10 of 34 references
Extracting Trust from Domain Analysis: A Case Study on the Wikipedia Project
TLDR
This evaluation, conducted on about 8,000 articles representing 65% of the overall Wikipedia editing activity, shows that the new trust evidence extracted from Wikipedia allows trust values to be computed transparently and automatically, isolating articles of high or low quality.
Measuring article quality in Wikipedia: models and evaluation
TLDR
This paper proposes three article quality measurement models that make use of the interaction data between articles and their contributors derived from the article edit history and proposes a model that combines partial reviewership of contributors as they edit various portions of the articles.
Evaluating authoritative sources using social networks: an insight from Wikipedia
TLDR
The approach presented here addresses the problem of assessing the quality of Wikipedia content from a social network point of view, and it is believed that it could be used to improve the authoritativeness of content found in Wikipedia and similar sources.
Assessing Information Quality of a Community-Based Encyclopedia
TLDR
This work proposes seven IQ metrics which can be evaluated automatically and tests the set on a representative sample of Wikipedia content, along with a number of statistical characterizations of Wikipedia articles, their content construction, process metadata, and social context.
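The sketch below computes a few revision-metadata indicators (edit count, unique editors, anonymous-edit ratio, article age, currency) in the spirit of automatically evaluable IQ metrics; these are illustrative stand-ins, not the specific seven metrics proposed in the paper.

```python
# Sketch of metadata-based quality indicators computed from an article's
# revision history: illustrative stand-ins for automatically evaluable IQ
# metrics, not the specific seven proposed in the paper.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Revision:
    editor: str
    timestamp: datetime
    is_anonymous: bool

def edit_metrics(revisions: list[Revision], now: datetime) -> dict[str, float]:
    editors = {r.editor for r in revisions}
    anon = sum(r.is_anonymous for r in revisions)
    return {
        "num_edits": len(revisions),
        "num_unique_editors": len(editors),
        "anonymous_edit_ratio": anon / len(revisions),
        "article_age_days": (now - min(r.timestamp for r in revisions)).days,
        "days_since_last_edit": (now - max(r.timestamp for r in revisions)).days,
    }

revs = [
    Revision("alice", datetime(2008, 1, 5), False),
    Revision("10.0.0.7", datetime(2008, 3, 2), True),
    Revision("bob", datetime(2009, 2, 20), False),
]
print(edit_metrics(revs, now=datetime(2009, 6, 1)))
```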
The Anatomy of a Large-Scale Hypertextual Web Search Engine
TLDR
This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
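This reference is best known for introducing PageRank; purely as a refresher, here is a compact power-iteration sketch on a made-up four-page link graph (damping factor 0.85 is the conventional choice).

```python
# Compact power-iteration sketch of PageRank on a made-up link graph.
# Damping factor 0.85 is the conventional choice.
import numpy as np

links = {            # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = sorted(links)
index = {p: i for i, p in enumerate(pages)}
n = len(pages)

# Column-stochastic transition matrix: M[j, i] = 1/outdegree(i) if i links to j.
M = np.zeros((n, n))
for src, outs in links.items():
    for dst in outs:
        M[index[dst], index[src]] = 1.0 / len(outs)

d = 0.85
rank = np.full(n, 1.0 / n)
for _ in range(100):                      # power iteration
    rank = (1 - d) / n + d * M @ rank

print(dict(zip(pages, rank.round(3))))
```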
Exploring the Feasibility of Automatically Rating Online Article Quality
We demonstrate the feasibility of building an automatic system to assign quality ratings to articles in Wikipedia, the online encyclopedia. Our preliminary system uses a Maximum Entropy…
Access, claims and quality on the internet-Future challenges
TLDR
Following a survey of important developments, this essay suggests dimensions that need to be included in a future web: 1) variants and multiple claims; 2) levels of certainty in making a claim; 3) levels of authority in defending a claim; 4) levels in assessing a claim; 5) levels of thoroughness in dealing with a claim.
How do users evaluate the credibility of Web sites?: a study with over 2,500 participants
In this study 2,684 people evaluated the credibility of two live Web sites on a similar topic (such as health sites). We gathered the comments people wrote about each site's credibility and analyzed…
IR evaluation methods for retrieving highly relevant documents
TLDR
The novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.
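This reference is associated with graded-relevance effectiveness measures such as cumulated gain and discounted cumulated gain; as a reminder of the idea, here is a short sketch of both, assuming integer relevance grades for a ranked result list.

```python
# Sketch of cumulated gain (CG) and discounted cumulated gain (DCG) over a
# ranked result list with graded (non-binary) relevance judgments.
import math

def cumulated_gain(gains: list[int]) -> list[float]:
    out, total = [], 0.0
    for g in gains:
        total += g
        out.append(total)
    return out

def discounted_cumulated_gain(gains: list[int], base: int = 2) -> list[float]:
    out, total = [], 0.0
    for rank, g in enumerate(gains, start=1):
        # Ranks below the log base are not discounted, as in the original formulation.
        total += g if rank < base else g / math.log(rank, base)
        out.append(total)
    return out

graded = [3, 2, 3, 0, 1, 2]      # relevance grades of the top-ranked documents
print(cumulated_gain(graded))
print(discounted_cumulated_gain(graded))
```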
A content-driven reputation system for the wikipedia
TLDR
The results show that the notion of reputation has good predictive value: changes performed by low-reputation authors have a significantly larger than average probability of having poor quality, as judged by human observers, and of being later undone, as measured by the algorithms.
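A toy caricature of content-persistence-based reputation, under the simplifying assumption that an author is rewarded in proportion to how much of their added text survives a later revision; this is not the paper's actual algorithm, just the general intuition.

```python
# Toy caricature of content-driven reputation: an author's reputation grows
# when text they added survives later revisions and shrinks when it is
# removed. This is a simplification for illustration, not the paper's algorithm.
import re
from collections import defaultdict

reputation: dict[str, float] = defaultdict(lambda: 1.0)

def update_reputation(author: str, added_words: set[str], later_revision: str,
                      weight: float = 0.1) -> None:
    """Reward the author in proportion to how much of their added text survives."""
    if not added_words:
        return
    later_words = set(re.findall(r"\w+", later_revision.lower()))
    survival = len(added_words & later_words) / len(added_words)
    # survival is in [0, 1]; 0.5 is the neutral point between reward and penalty.
    reputation[author] += weight * (survival - 0.5)

later = "The Battle of Hastings took place in 1066."
update_reputation("alice", {"battle", "of", "hastings", "1066"}, later)
update_reputation("mallory", {"totally", "fake", "claim"}, later)
print(dict(reputation))
```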