MinIE: Minimizing Facts in Open Information Extraction

@inproceedings{Gashteovski2017MinIEMF,
  title={MinIE: Minimizing Facts in Open Information Extraction},
  author={Kiril Gashteovski and Rainer Gemulla and Luciano Del Corro},
  booktitle={EMNLP},
  year={2017}
}
The goal of Open Information Extraction (OIE) is to extract surface relations and their arguments from natural-language text in an unsupervised, domain-independent manner. In this paper, we propose MinIE, an OIE system that aims to provide useful, compact extractions with high precision and recall. MinIE approaches these goals by (1) representing information about polarity, modality, attribution, and quantities with semantic annotations instead of in the actual extraction, and (2) identifying… 

Open Information Extraction on Scientific Text: An Evaluation
TLDR
It is found that OIE systems perform significantly worse on scientific text than encyclopedic text, and an error analysis is provided to suggest areas of work to reduce errors.
OPIEC: An Open Information Extraction Corpus
TLDR
It is found that most of the facts between entities present in OPIEC cannot be found in DBpedia and/or YAGO, that OIE facts often differ in the level of specificity compared to knowledge base facts, and that OIE open relations are generally highly polysemous.
Weakly Supervised, Data-Driven Acquisition of Rules for Open Information Extraction
TLDR
A way to acquire rules for Open Information Extraction, based on lemma sequence patterns (including potential typographical symbols) linking two named entities in a sentence, is proposed, which does not necessitate expensive resources or time-consuming handcrafted resources, but does require a large amount of text.
AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark
TLDR
AnnIE is proposed: an interactive annotation platform that facilitates challenging OIE annotation tasks and supports creation of complete fact-oriented OIE evaluation benchmarks; it is modular and flexible in order to support different use-case scenarios and different languages.
On Aligning OpenIE Extractions with Knowledge Bases: A Case Study
TLDR
This paper directly evaluates how OIE triples from the OPIEC corpus are related to the DBpedia KB w.r.t. information content and suggests that a significant part of OIE triples can be expressed by means of KB formulas instead of individual facts.
GenIE: Generative Information Extraction
TLDR
This work introduces GenIE (generative information extraction), the first end-to-end autoregressive formulation of closed information extraction, and paves the way towards a unified end-to-end approach to the core tasks of information extraction.
WiRe57: A Fine-Grained Benchmark for Open Information Extraction
TLDR
The non-trivial problem of evaluating the extractions produced by systems against the reference tuples is addressed, and the MinIE system is found to perform best.
Open Information Extraction with Global Structure Constraints
TLDR
A novel open IE system, called ReMine, is proposed, which integrates local context signal and global structural signal in a unified framework with distant supervision and can effectively score sentence-level tuple extractions based on corpus-level statistics.
Integrating Local Context and Global Cohesiveness for Open Information Extraction
TLDR
This paper proposes a novel Open IE system, called ReMine, which integrates local context signals and global structural signals in a unified, distant-supervision framework that can be applied to many different domains to facilitate sentence-level tuple extractions using corpus-level statistics.

References

Identifying Relations for Open Information Extraction
TLDR
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOEpos.
Leveraging Linguistic Structure For Open Domain Information Extraction
TLDR
This work replaces the large pattern set used by earlier systems with a few patterns for canonically structured sentences, and shifts the focus to a classifier which learns to extract self-contained clauses from longer sentences to determine the maximally specific arguments for each candidate triple.
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary
KrakeN: N-ary Facts in Open Information Extraction
TLDR
KrakeN, an OIE system specifically designed to capture N-ary facts, is presented, along with the results of an experimental study on extracting facts from Web text in which the issue of fact completeness is examined.
Nested Propositions in Open Information Extraction
TLDR
NESTIE is proposed, which uses a nested representation to extract higher-order relations and complex, interdependent assertions; nesting the extracted propositions allows NESTIE to more accurately reflect the meaning of the original sentence.
ClausIE: clause-based open information extraction
TLDR
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
Dependency-Based Open Information Extraction
TLDR
A new multilingual OIE system based on robust and fast rule-based dependency parsing is presented; it extracts more precise assertions from text than state-of-the-art OIE systems while keeping a crucial property of those systems: scaling to Web-size document collections.
Open Information Extraction via Contextual Sentence Decomposition
TLDR
It is shown how contextual sentence decomposition (CSD), a technique originally developed for high-precision semantic search, can be used for open information extraction (OIE), and how CSD-IE achieves precision and recall comparable to ClausIE, but at significantly better minimality.
Open Information Extraction Using Wikipedia
TLDR
WOE is presented, an open IE system that improves dramatically on TextRunner's precision and recall by using a novel form of self-supervised learning for open extractors: heuristic matches between Wikipedia infobox attribute values and corresponding sentences are used to construct training data.
Creating a Large Benchmark for Open Information Extraction
TLDR
This work develops a methodology that leverages the recent QA-SRL annotation to create a first independent and large-scale Open IE annotation and uses it to automatically compare the most prominent Open IE systems.