Modeling sequential annotations for sequence labeling with crowds
@article{Lu2021ModelingSA,
  title   = {Modeling sequential annotations for sequence labeling with crowds},
  author  = {Xiaolei Lu and Tommy W. S. Chow},
  journal = {IEEE Transactions on Cybernetics},
  year    = {2021},
  volume  = {PP}
}
Crowd sequential annotation can be an efficient and cost-effective way to build large datasets for sequence labeling. Unlike tagging independent instances, in crowd sequential annotation the quality of a label sequence depends on the annotators' expertise in capturing the internal dependencies among the tokens in the sequence. In this article, we propose modeling sequential annotation for sequence labeling with crowds (SA-SLC). First, a conditional probabilistic model is developed to…
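As context for why sequential structure matters, here is a minimal Python sketch of the naive baseline that models like SA-SLC improve on: independent per-token majority voting over crowd label sequences. The function name, the BIO tags, and the toy data are illustrative assumptions, not part of the paper.

```python
from collections import Counter

def majority_vote_tokens(annotations):
    """Aggregate crowd label sequences by independent per-token majority vote.

    `annotations` holds one label sequence per annotator for the same
    sentence; ties are broken arbitrarily by Counter ordering.
    """
    length = len(annotations[0])
    assert all(len(seq) == length for seq in annotations)
    aggregated = []
    for t in range(length):
        votes = Counter(seq[t] for seq in annotations)
        aggregated.append(votes.most_common(1)[0][0])
    return aggregated

# Three annotators labeling the same 4-token sentence with BIO tags:
crowd = [
    ["B-PER", "I-PER", "O", "O"],
    ["B-PER", "O",     "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "B-LOC"],
]
print(majority_vote_tokens(crowd))  # ['B-PER', 'I-PER', 'O', 'B-LOC']
```

Because each token is voted on independently, this baseline can emit invalid transitions (for example, an I- tag with no preceding B- tag), which is exactly the kind of internal dependency the abstract says annotators and models must capture.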
One Citation
Classification-oriented Dawid-Skene model for transferring intelligence from crowds to machines
- Computer Science · Frontiers of Computer Science · 2022
A Classification-Oriented Dawid-Skene (CODS) model is developed that simultaneously achieves three objectives, including learning a classifier capable of labelling future items without further assistance from crowd workers.
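The CODS model extends the classic Dawid-Skene estimator. As background only (this is not the CODS model), a minimal, unoptimized EM sketch of the original Dawid-Skene aggregation for categorical crowd labels follows; the names are illustrative, and every item is assumed to receive at least one label.

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50):
    """Minimal Dawid-Skene EM for aggregating categorical crowd labels.

    labels[i][j] is worker j's label for item i, or -1 if unobserved.
    Returns (posterior over true labels, per-worker confusion matrices).
    """
    labels = np.asarray(labels)
    n_items, n_workers = labels.shape
    # Initialize posteriors with per-item vote fractions (majority voting).
    post = np.zeros((n_items, n_classes))
    for i in range(n_items):
        for j in range(n_workers):
            if labels[i, j] >= 0:
                post[i, labels[i, j]] += 1.0
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class priors and smoothed worker confusion matrices.
        prior = post.mean(axis=0)
        conf = np.full((n_workers, n_classes, n_classes), 1e-6)
        for i in range(n_items):
            for j in range(n_workers):
                if labels[i, j] >= 0:
                    conf[j, :, labels[i, j]] += post[i]
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: posterior over each item's true label.
        log_post = np.tile(np.log(prior), (n_items, 1))
        for i in range(n_items):
            for j in range(n_workers):
                if labels[i, j] >= 0:
                    log_post[i] += np.log(conf[j, :, labels[i, j]])
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post, conf
```

Hard consensus labels are then `post.argmax(axis=1)`; per the summary above, CODS additionally learns a classifier rather than only aggregating labels.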
References
Showing 1–10 of 33 references
A Bayesian Approach for Sequence Tagging with Crowds
- Computer Science · EMNLP · 2019
This work proposes a Bayesian method for aggregating sequence tags that reduces errors by modelling sequential dependencies between the annotations as well as the ground-truth labels. The approach can also reduce crowdsourcing costs through more effective active learning, since it better captures uncertainty in the sequence labels when annotations are few.
Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
- Computer Science · ACL · 2020
A novel framework, Consensus Network, is proposed that can be trained on annotations from multiple sources and dynamically aggregates source-specific knowledge through a context-aware attention module, yielding a model that reflects the agreement among the sources.
Sembler: Ensembling Crowd Sequential Labeling for Improved Quality
- Computer Science · AAAI · 2012
The proposed Sembler model, a statistical model for ensembling crowd sequential labelings, is evaluated on a real Twitter data set and a synthetic biological data set; Sembler proves particularly accurate when more than half of the annotators make mistakes.
Sequence labeling with multiple annotators
- Computer Science · Machine Learning · 2013
A probabilistic approach to sequence labeling with Conditional Random Fields (CRFs) is presented for situations where label sequences from multiple annotators are available but no actual ground truth exists.
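The multi-annotator CRF itself is too involved for a short sketch, but the sequential dependency it exploits comes from linear-chain decoding, where a transition score couples adjacent labels. Below is a generic Viterbi routine in that spirit (not the paper's model); the emission and transition scores are assumed to be given in log-space.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Most likely label sequence under a linear-chain model.

    emissions:   (T, K) per-token label scores in log-space.
    transitions: (K, K) log-score of moving from label a to label b.
    """
    T, K = emissions.shape
    dp = emissions[0].copy()            # best score ending in each label
    back = np.zeros((T, K), dtype=int)  # best predecessor per label
    for t in range(1, T):
        scores = dp[:, None] + transitions + emissions[t]
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0)
    # Backtrack from the best final label.
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Assigning a strongly negative transition score to invalid moves (e.g., O followed by I-PER in a BIO scheme) is how such chain models rule out the inconsistent sequences that per-token aggregation can produce.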
Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks
- Computer Science · J. Mach. Learn. Res. · 2012
An empirical Bayesian algorithm called SpEM is proposed that iteratively eliminates the spammers and estimates the consensus labels based only on the good annotators; the algorithm is motivated by a spammer score that can be used to rank the annotators.
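For binary tasks, the intuition behind such a score can be stated compactly: a spammer's responses do not depend on the true label, which for sensitivity alpha and specificity beta means alpha + beta = 1. The snippet below computes a scalar score in that spirit; treat the exact formula as an illustrative simplification rather than the paper's definition.

```python
def spammer_score(sensitivity, specificity):
    """Scalar spammer score for a binary labeling task.

    An annotator who ignores the true label has the same 'positive' rate
    on positive and negative items, i.e. sensitivity + specificity == 1,
    so the score is 0 for a spammer and 1 for a perfect (or perfectly
    inverted) annotator.
    """
    return abs(sensitivity + specificity - 1.0)

# A careful annotator vs. one answering nearly at random:
print(spammer_score(0.95, 0.90))  # 0.85 -> likely good
print(spammer_score(0.60, 0.42))  # 0.02 -> close to spamming
```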
Aggregating and Predicting Sequence Labels from Crowd Annotations
- Computer Science · ACL · 2017
A suite of methods is evaluated for aggregating sequential crowd labels to infer a single best set of consensus annotations, and for using crowd annotations as training data for a model that can predict sequences in unannotated text.
Learning From Crowds
- Computer Science · J. Mach. Learn. Res. · 2010
A probabilistic approach to supervised learning is presented for settings in which multiple annotators provide (possibly noisy) labels but there is no absolute gold standard; experimental results indicate that the proposed method is superior to the commonly used majority-voting baseline.
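A minimal sketch of the binary "two-coin" EM idea from this line of work is shown below: it alternates between estimating each annotator's sensitivity/specificity and fitting a logistic regression on soft labels. The soft-label trick (duplicating rows with sample weights) and all names are implementation assumptions, not prescribed by the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def learn_from_crowds(X, Y, n_iter=20):
    """EM sketch for binary labels from m annotators (fully observed).

    X: (n, d) features; Y: (n, m) 0/1 labels, one column per annotator.
    Returns the classifier, per-annotator sensitivity alpha and
    specificity beta, and soft true-label estimates mu.
    """
    X, Y = np.asarray(X), np.asarray(Y, dtype=float)
    n, m = Y.shape
    mu = Y.mean(axis=1)  # initialize with the fraction of positive votes
    clf = LogisticRegression()
    for _ in range(n_iter):
        # M-step: annotator reliability, clipped away from 0/1 for stability.
        alpha = np.clip((mu[:, None] * Y).sum(axis=0) / mu.sum(),
                        1e-6, 1 - 1e-6)
        beta = np.clip(((1 - mu)[:, None] * (1 - Y)).sum(axis=0)
                       / (1 - mu).sum(), 1e-6, 1 - 1e-6)
        # M-step: fit on soft labels by duplicating rows with weights.
        Xd = np.vstack([X, X])
        yd = np.concatenate([np.ones(n), np.zeros(n)])
        w = np.concatenate([mu, 1 - mu])
        clf.fit(Xd, yd, sample_weight=w)
        # E-step: posterior of the true label given votes and classifier.
        p = clf.predict_proba(X)[:, 1]
        a = p * np.prod(alpha ** Y * (1 - alpha) ** (1 - Y), axis=1)
        b = (1 - p) * np.prod(beta ** (1 - Y) * (1 - beta) ** Y, axis=1)
        mu = a / (a + b)
    return clf, alpha, beta, mu
```

Duplicating each row with weights mu and 1 - mu is a standard way to fit a probabilistic classifier to soft targets without writing a custom loss.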
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
- Computer Science · EMNLP · 2008
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster way to collect annotations from a broad base of paid non-expert contributors over the Web, and proposes a bias-correction technique that significantly improves annotation quality on two tasks.
Modeling annotator expertise: Learning when everybody knows a bit of something
- Computer Science · AISTATS · 2010
This paper develops a probabilistic approach for settings where annotators may be unreliable and their expertise varies depending on the data they observe; the approach provides clear advantages over previously introduced multi-annotator methods.
Learning from crowdsourced labeled data: a survey
- Computer Science · Artificial Intelligence Review · 2016
This survey introduces the basic concepts of label quality and learning models, and presents openly accessible real-world data sets collected from crowdsourcing systems together with open-source libraries and tools.