Crowdsourcing Semantic Label Propagation in Relation Classification

Anca Dumitrache, Lora Aroyo, Chris Welty
Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that indicates still more would be better. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth methodology for crowdsourcing… 

Model Positionality and Computational Reflexivity: Promoting Reflexivity in Data Science
This work introduces the concepts of model positionality and computational reflexivity that can help data scientists to reflect on and communicate the social and cultural context of a model’s development and use, the data annotators and their annotations, and the data scientists themselves.
Knowledge Graphs Evolution and Preservation - A Technical Report from ISWS 2019
This document reports a collaborative effort performed by nine teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019), which provides a different perspective to the problem of knowledge graph evolution substantiated by a set of research questions as the main subject of their investigation.
Scaling and Disagreements: Bias, Noise, and Ambiguity
This paper investigates the use of temperature scaling, an approach developed to estimate noise, in learning from data containing disagreements, and finds that it works with data in which the disagreements result from label overlap, but not with data in which they are due to annotator bias.
Enhancing Quality of Corpus Annotation: Construction of the Multi-Layer Corpus Annotation and Simplified Validation of the Corpus Annotation
Based on the validation results, the tendency of annotation across the entire corpus is observed macroscopically, and the corpus annotation validation results are analyzed microscopically to verify the validation methodology to address the case study.


False Positive and Cross-relation Signals in Distant Supervision Data
The results of a crowdsourcing relation extraction task are used to identify two problems with DS data quality: the widely varying degree of false positives across different relations, and the observed causal connection between relations that are not considered by the DS method.
Big Data versus the Crowd: Looking for Relationships in All the Right Places
The experiments show that increasing the corpus size for distant supervision has a statistically significant, positive impact on quality (F1 score), and that human feedback has a positive and statistically significant, but lower, impact on precision and recall.
Distant Supervision for Relation Extraction with Sentence-Level Attention and Entity Descriptions
This paper proposes a sentence-level attention model to select the valid instances, which makes full use of the supervision information from knowledge bases, and extracts entity descriptions from Freebase and Wikipedia pages to supplement background knowledge for the authors' task.
Infusion of Labeled Data into Distant Supervision for Relation Extraction
This paper demonstrates how a state-of-the-art multi-instance multi-label model can be modified to make use of reliable sentence-level labels in addition to the relation-level distant supervision from a database.
Combining Distant and Partial Supervision for Relation Extraction
This work presents an approach for providing partial supervision to a distantly supervised relation extractor using a small number of carefully selected examples, and proposes a novel criterion to sample examples which are both uncertain and representative.
Effective Deep Memory Networks for Distant Supervised Relation Extraction
A novel neural approach for distant supervised RE with a special focus on attention mechanisms is introduced. It includes two major attention-based memory components, which are capable of explicitly capturing the importance of each context word for modeling the representation of the entity pair.
Relation Extraction Using Label Propagation Based Semi-Supervised Learning
This paper investigates a graph based semi-supervised learning algorithm, a label propagation (LP) algorithm, for relation extraction that represents labeled and unlabeled examples and their distances as the nodes and the weights of edges of a graph, and tries to obtain a labeling function to satisfy two constraints.
Distant supervision for relation extraction without labeled data
This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Relation Extraction: Perspective from Convolutional Neural Networks
This work introduces a convolutional neural network for relation extraction that automatically learns features from sentences and minimizes the dependence on external toolkits and resources.