# Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification

@article{Sinha2018FastDA, title={Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification}, author={Vaibhav Sinha and Sukrut Rao and V. Balasubramanian}, journal={arXiv: Machine Learning}, year={2018} }

Many real world problems can now be effectively solved using supervised machine learning. A major roadblock is often the lack of an adequate quantity of labeled data for training. A possible solution is to assign the task of labeling data to a crowd, and then infer the true label using aggregation methods. A well-known approach for aggregation is the Dawid-Skene (DS) algorithm, which is based on the principle of Expectation-Maximization (EM). We propose a new simple, yet effective, EM-based…
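For orientation, the classical Dawid-Skene aggregation mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration of the standard EM formulation, not the method proposed in the paper (the abstract is truncated before it describes that). The function name `dawid_skene`, the `(item, worker, label)` input layout, the smoothing constant, and the stopping rule are all assumptions made for this sketch; it also assumes every item receives at least one vote.

```python
import numpy as np

def dawid_skene(votes, n_classes, n_iter=50, tol=1e-6):
    """Classical Dawid-Skene EM (illustrative sketch).

    votes: iterable of (item, worker, label) integer triples, 0-indexed.
    Returns (predicted_labels, posteriors).
    """
    votes = np.asarray(votes)
    items, workers, labels = votes[:, 0], votes[:, 1], votes[:, 2]
    n_items = items.max() + 1
    n_workers = workers.max() + 1

    # Initialise the posteriors T[i, k] with per-item vote fractions.
    T = np.zeros((n_items, n_classes))
    np.add.at(T, (items, labels), 1.0)
    T /= T.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per worker.
        # conf[w, k, l] estimates P(worker w reports l | true class k).
        prior = T.mean(axis=0)
        conf = np.full((n_workers, n_classes, n_classes), 1e-6)  # smoothing (assumed)
        for k in range(n_classes):
            np.add.at(conf[:, k, :], (workers, labels), T[items, k])
        conf /= conf.sum(axis=2, keepdims=True)

        # E-step: posterior over each item's true class, in log space.
        log_T = np.tile(np.log(prior + 1e-12), (n_items, 1))
        for k in range(n_classes):
            np.add.at(log_T[:, k], items, np.log(conf[workers, k, labels]))
        log_T -= log_T.max(axis=1, keepdims=True)
        new_T = np.exp(log_T)
        new_T /= new_T.sum(axis=1, keepdims=True)

        converged = np.abs(new_T - T).max() < tol
        T = new_T
        if converged:
            break

    return T.argmax(axis=1), T
```

Calling `dawid_skene(votes, n_classes=2)` on a list of vote triples returns the most probable label per item together with the full posterior matrix. The per-worker confusion matrices are what let the model down-weight, or even invert, the votes of consistently unreliable workers.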

#### 3 Citations

Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution

- Computer Science
- COLING
- 2020

A scalable methodology for estimating the noisiness of labels produced by a typical crowdsourcing semantic annotation task and reducing the resulting error of the labeling process by as much as 20-30% in comparison to other common labeling strategies is developed and implemented.

Discovering Biased News Articles Leveraging Multiple Human Annotations

- Computer Science
- LREC
- 2020

The goal is to compare domain experts to crowd workers and also to prove that media bias can be detected automatically, and to contribute to a trustworthy media ecosystem by automatically identifying politically biased news articles.

Some people aren't worth listening to: periodically retraining classifiers with feedback from a team of end users

- Computer Science, Mathematics
- ArXiv
- 2020

A classifier is demonstrated that can learn which users tend to be unreliable, filtering their feedback out of the loop, thus improving performance in subsequent iterations.

#### References


Active Learning for Crowd-Sourced Databases

- Computer Science
- ArXiv
- 2012

Two new active learning algorithms are presented to combine humans and algorithms together in a crowd-sourced database, based on the theory of non-parametric bootstrap, which makes their results applicable to a broad class of machine learning models.

Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales

- Computer Science
- ACL
- 2005

A meta-algorithm is applied, based on a metric labeling formulation of the rating-inference problem, that alters a given n-ary classifier's output in an explicit attempt to ensure that similar items receive similar labels.

Variational Inference for Crowdsourcing

- Computer Science
- NIPS
- 2012

By choosing the prior properly, both BP and MF perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.

Learning Supervised Topic Models for Classification and Regression from Crowds

- Computer Science, Mathematics
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2017

This article proposes two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds and develops an efficient stochastic variational inference algorithm.

Using Crowdsourcing and Active Learning to Track Sentiment in Online Media

- Computer Science
- ECAI
- 2010

A system for tracking economic sentiment in online media that has been deployed since August 2009 is described, which uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items.

Online crowdsourcing: Rating annotators and obtaining cost-effective labels

- Computer Science
- 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops
- 2010

A model of the labeling process which includes label uncertainty, as well as a multi-dimensional measure of the annotators' ability, is proposed, from which an online algorithm is derived that estimates the most likely value of the labels and the annotator abilities.

Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels

- Mathematics
- 2013

Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by…

The Multidimensional Wisdom of Crowds

- Computer Science
- NIPS
- 2010

A method for estimating the underlying value of each image from (noisy) annotations provided by multiple annotators, based on a model of the image formation and annotation process, which predicts ground truth labels on both synthetic and real data more accurately than state of the art methods.

Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2016

Experimental results demonstrate that the proposed algorithm for multi-class crowd labeling problems is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.

Deep learning from crowds

- Computer Science, Mathematics
- AAAI
- 2018

A novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation.