Corpus ID: 1813352

Extreme Extraction: Only One Hour per Relation

@article{Hoffmann2015ExtremeEO,
  title={Extreme Extraction: Only One Hour per Relation},
  author={Raphael Hoffmann and Luke Zettlemoyer and Daniel S. Weld},
  journal={ArXiv},
  year={2015},
  volume={abs/1506.06418}
}
Information Extraction (IE) aims to automatically generate a large knowledge base from natural language text, but progress remains slow. Supervised learning requires copious human annotation, while unsupervised and weakly supervised approaches do not deliver competitive accuracy. As a result, most fielded applications of IE, as well as the leading TAC-KBP systems, rely on significant amounts of manual engineering. Even "Extreme" methods, such as those reported in Freedman et al. 2011, require…
SEER: Auto-Generating Information Extraction Rules from User-Specified Examples
TLDR
The design behind SEER is explained and a user study comparing the system against a commercially available tool in which users create IE rules manually is presented, showing that SEER helps users complete text extraction tasks more quickly, as well as more accurately.
IKE - An Interactive Tool for Knowledge Extraction
TLDR
IKE is a new extraction tool that performs fast, interactive bootstrapping to develop high-quality extraction patterns for targeted relations and is the first interactive extraction tool to seamlessly integrate symbolic and distributional methods for search.
Neural Extractive Search
TLDR
The goals of this paper are to concisely introduce the extractive-search paradigm and to demonstrate a prototype neural retrieval system for extractive search, along with its benefits and potential.
End-to-End Learning for Answering Structured Queries Directly over Text
TLDR
This work presents an approach to answer structured queries directly over text data without storing results in a database, focusing on the case of knowledge bases where queries are over entities and the relations between them.
A Comparative Study on Structural and Semantic Properties of Sentence Embeddings
TLDR
Evaluating the extent to which sentences carrying similar senses are embedded in close-proximity sub-spaces, and whether that structure can be exploited to align sentences to a knowledge graph, provides useful information for developing embedding-based relation extraction methods.
IDEL: In-Database Neural Entity Linking
TLDR
This work presents a novel architecture, In-Database Entity Linking (IDEL), in which the analytical RDBMS MonetDB is integrated with neural text mining abilities, and a novel similarity function based on joint neural embeddings that is learned via minimizing pairwise contrastive ranking loss.
IDEL: In-Database Entity Linking with Neural Embeddings
TLDR
This work presents a novel architecture, In-Database Entity Linking (IDEL), in which the analytics-optimized RDBMS MonetDB is integrated with neural text mining abilities and proposes a novel similarity function based on joint neural embeddings learned via minimizing pairwise contrastive ranking loss.
Active Learning with Unbalanced Classes and Example-Generation Queries
TLDR
This paper extends the traditional active learning framework by investigating the problem of intelligently switching between various crowdsourcing strategies for obtaining labeled training examples in order to optimally train a classifier, and develops a novel, skew-robust algorithm, called MB-CB, for the control problem.
Tempura: Query Analysis with Structural Templates
TLDR
This work presents structural templates, abstract queries that replace tokens with their linguistic feature forms, as a query grouping method that allows analysts to create query groups with structural similarity at different granularities.
The Intelligent Management of Crowd-Powered Machine Learning
TLDR
The Intelligent Management of Crowd-Powered Machine Learning is presented, a meta-modelling framework for crowd-based learning that automates the very labor-intensive and therefore time-heavy and expensive process of training neural networks.

References

SHOWING 1-10 OF 41 REFERENCES
Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations
TLDR
A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.
Learning 5000 Relational Extractors
TLDR
LUCHS is presented, a self-supervised, relation-specific IE system which learns 5025 relations, more than an order of magnitude greater than any previous approach, with an average F1 score of 61%.
Distant supervision for relation extraction without labeled data
TLDR
This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Extreme Extraction – Machine Reading in a Week
TLDR
It is shown that while the recall of the handwritten rules surpasses that of the learning system, the learned system is able to improve the overall recall and F1.
Modeling Relations and Their Mentions without Labeled Text
TLDR
A novel approach to distant supervision that can alleviate the problem of noisy patterns that hurt precision by using a factor graph and applying constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in the authors' training KB.
SystemT: a system for declarative information extraction
TLDR
The extraction algebra is described, and the effectiveness of the optimization techniques in providing orders-of-magnitude reductions in the running time of complex extraction tasks is demonstrated.
Feature Engineering for Knowledge Base Construction
TLDR
The approach to KBC is based on joint probabilistic inference and learning, but the group does not see inference as either a panacea or a magic bullet: inference is a tool that allows the authors to be systematic in how they construct, debug, and improve the quality of such systems.
Joint Inference in Information Extraction
TLDR
This paper proposes a joint approach to information extraction, where segmentation of all records and entity resolution are performed together in a single integrated inference process, and is believed to be the first fully joint approach.
Active Learning for Natural Language Parsing and Information Extraction
TLDR
It is shown that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance for two complex tasks: semantic parsing and information extraction.
Multi-instance Multi-label Learning for Relation Extraction
TLDR
This work proposes a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables that performs competitively on two difficult domains.