Corpus ID: 6983197

The Tradeoffs Between Open and Traditional Relation Extraction

@inproceedings{Banko2008TheTB,
  title={The Tradeoffs Between Open and Traditional Relation Extraction},
  author={Michele Banko and Oren Etzioni},
  booktitle={ACL},
  year={2008}
}
Traditional Information Extraction (IE) takes a relation name and hand-tagged examples of that relation as input. […] We then present a new model for Open IE called O-CRF and show that it achieves increased precision and nearly double the recall of the model employed by TEXTRUNNER, the previous state-of-the-art Open IE system. Second, when the number of target relations is small, and their names are known in advance, we show that O-CRF is able to match the precision of a traditional extraction…

Towards Large-Scale Unsupervised Relation Extraction from the Web
TLDR
A novel unsupervised algorithm is presented that provides a more general treatment of the polysemy and synonymy problems and explicitly disambiguates polysemous relation phrases and groups synonymous ones and achieves significant improvement on recall compared to the previous method.
Identifying Relations for Open Information Extraction
TLDR
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOEpos.
Improving Open Relation Extraction via Sentence Re-Structuring
TLDR
The proposed approach replaces complex sentences with several simpler ones that, together, convey the same meaning and are more amenable to extraction by current ORE systems, and succeeds in reducing the processing time and increasing the accuracy of state-of-the-art ORE systems.
Confidence measure estimation for Open Information Extraction
TLDR
A new method of confidence estimation for OIE, called Relation Confidence Estimator for Open Information Extraction (RCE-OIE), is presented; it investigates incorporating several proposed features when assigning a confidence metric using logistic regression, and demonstrates how semantic information can be used in feature-based confidence estimation of Open Relation Extraction to further improve performance.
An analysis of open information extraction based on semantic role labeling
TLDR
This work investigates the use of semantic role labeling techniques for the task of Open IE and compares SRL-based open extractors with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics.
On Aligning OpenIE Extractions with Knowledge Bases: A Case Study
TLDR
This paper directly evaluates how OIE triples from the OPIEC corpus are related to the DBpedia KB w.r.t. information content and suggests that a significant part of OIE triples can be expressed by means of KB formulas instead of individual facts.
Open Relation Extraction and Grounding
TLDR
This work proposes a novel importance-based open RE approach by exploiting the global structure of a dependency tree to extract salient triples from large-scale corpora by leveraging KB triples and weighted context words associated with relational triples.
Comparison of Open Information Extraction for English and Spanish
TLDR
This work presents a relation extraction algorithm for Open IE in Spanish, based on POS-tagged input and semantic constraints, and shows that the performance is comparable with the state-of-the-art systems, while the system is more robust to noisy input.
Open information extraction based on lexical semantics
TLDR
An open extractor elaborated from the belief that it is not necessary to have an enormous list of patterns or several types of linguistic labels to better perform Open IE is described, demonstrating the feasibility of an extractor based on simple lexical-syntactic patterns.
A Hybrid Method for Open Information Extraction Based on Shallow and Deep Linguistic Analysis
TLDR
Two novel hybrid methods are presented which combine a high-performance subset of shallow Open IE systems with the strengths of a deep Open IE system and detect the best trade-off between precision and recall by tuning two combination parameters: sentence length and confidence measure.

References

SHOWING 1-10 OF 29 REFERENCES
Unsupervised Resolution of Objects and Relations on the Web
TLDR
A scalable, fully-implemented system for SR that runs in O(KN log N) time in the number of extractions N and the maximum number of synonyms per word, K, and introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them.
TEG—a hybrid approach to information extraction
TLDR
The experiments show that the hybrid approach outperforms both purely statistical and purely knowledge-based systems, while requiring orders of magnitude less manual rule writing and smaller amounts of training data.
Unsupervised named-entity extraction from the Web: An experimental study
Automatic Discovery of Part-Whole Relations
TLDR
This paper presents a supervised, semantically intensive, domain independent approach for the automatic detection of part-whole relations in text and demonstrates the importance of word sense disambiguation for this task.
Learning Syntactic Patterns for Automatic Hypernym Discovery
TLDR
This paper presents a new algorithm for automatically learning hypernym (is-a) relations from text, using "dependency path" features extracted from parse trees and introduces a general-purpose formalization and generalization of these patterns.
Snowball: extracting relations from large plain-text collections
TLDR
This paper develops a scalable evaluation methodology and metrics for the task, and presents a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.
Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text
TLDR
A probabilistic extraction model is described that provides mutual benefits to both "top-down" relational pattern discovery and "bottom-up" relation extraction.
Preemptive Information Extraction using Unrestricted Relation Discovery
TLDR
A technique called Unrestricted Relation Discovery is proposed that discovers all possible relations from texts and presents them as tables in order to extend the boundary of Information Extraction systems.
Extracting Patterns and Relations from the World Wide Web
TLDR
This paper presents a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample and uses it to extract a relation of (author,title) pairs from the World Wide Web.
On-Demand Information Extraction
TLDR
On-demand Information Extraction (ODIE) aims to completely eliminate the customization effort; experimental results are reported in which the system created useful tables for many topics, demonstrating the feasibility of this approach.