Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

@inproceedings{Sun2018LogicianAU,
  title={Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction},
  author={Mingming Sun and Xu Li and Xin Wang and Miao Fan and Yue Feng and Ping Li},
  booktitle={Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining},
  year={2018}
}
  • Mingming Sun, Xu Li, Xin Wang, Miao Fan, Yue Feng, Ping Li
  • Published 2 February 2018
  • Computer Science
  • Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
In this paper, we consider the problem of open information extraction (OIE) for extracting entity and relation level intermediate structures from sentences in open-domain. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set which contains 48,248 sentences and the corresponding facts in the SAOKE format labeled by crowdsourcing. To our…
Semi-Open Information Extraction
TLDR
This paper proposes a large-scale human-annotated benchmark called SOIED, consisting of 61,984 facts for 8,013 subject entities annotated on 24,000 Chinese sentences collected from a web search engine, and proposes a novel unified model called USE for this task.
Improving Open Information Extraction with Distant Supervision Learning
TLDR
A distant supervision learning approach is employed to improve Open IE, using two popular sequence-to-sequence models (RNN and Transformer) and a large benchmark data set to demonstrate its performance.
A Predicate-Function-Argument Annotation of Natural Language for Open-Domain Information Expression
TLDR
A new pipeline for building OIE systems is proposed, in which an Open-domain Information eXpression (OIX) task provides a platform on which OIE strategies can be developed as inference operations focusing on more critical problems.
Explainable OpenIE Classifier with Morpho-syntactic Rules
TLDR
This paper introduces TabOIEC, a multilingual classifier based on generic morphosyntactic features. It uses a glass-box method that can provide interpretations of some of the classifier's decisions, and the approach is reported to improve F1 measures for all languages, particularly in the monolingual setting.
IMoJIE: Iterative Memory-Based Joint Open Information Extraction
TLDR
IMoJIE is presented, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples, establishing a new state of the art for the task.
CrossOIE: Cross-Lingual Classifier for Open Information Extraction
TLDR
CrossOIE is presented, a multilingual, publicly available relation tuple validity classifier that scores OpenIE systems’ extractions based on their estimated quality and can be used to improve Open IE systems and assist in the creation of Open IE benchmarks for different languages.
Improving Open Information Extraction via Iterative Rank-Aware Learning
TLDR
This work finds that the extraction likelihood, a confidence measure used by current supervised open IE systems, is not well calibrated when comparing the quality of assertions extracted from different sentences, and proposes an additional binary classification loss to calibrate the likelihood to make it more globally comparable.
Supervising Unsupervised Open Information Extraction Models
TLDR
A novel supervised open information extraction framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance; the proposed method is shown to outperform existing supervised and unsupervised models by a significant margin.
OpenIE6: Iterative Grid Labeling and Coordination Analysis for Open Information Extraction
TLDR
This paper presents an iterative labeling-based system that establishes a new state of the art for OpenIE, while extracting 10x faster, through a novel Iterative Grid Labeling (IGL) architecture, which treats OpenIE as a 2-D grid labeling task.
Pattern Learning for Chinese Open Information Extraction
TLDR
PLCOIE can extract binary relation triples as well as N-ary relation tuples, and experiments show that its results are more precise than those of state-of-the-art Chinese OIE systems, which indicates that PLCOIE is feasible and effective.

References

SHOWING 1-10 OF 56 REFERENCES
An analysis of open information extraction based on semantic role labeling
TLDR
This work investigates the use of semantic role labeling techniques for the task of Open IE and compares SRL-based open extractors with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics.
ClausIE: clause-based open information extraction
TLDR
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
Open question answering over curated and extracted knowledge bases
TLDR
This paper presents OQA, the first approach to leverage both curated and extracted KBs, and demonstrates that it achieves up to twice the precision and recall of a state-of-the-art Open QA system.
End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures
TLDR
A novel end-to-end neural model that extracts entities and the relations between them, and compares favorably with the state-of-the-art CNN-based model (in F1-score) on nominal relation classification (SemEval-2010 Task 8).
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
Open Information Extraction Systems and Downstream Applications
  • Mausam
  • Computer Science
  • IJCAI
  • 2016
TLDR
A decade of progress on building Open IE extractors is described, which results in the latest extractor, OPENIE4, which is computationally efficient, outputs n-ary and nested relations, and also outputs relations mediated by nouns in addition to verbs.
Adapting Open Information Extraction to Domain-Specific Relations
TLDR
The steps needed to adapt Open IE to a domain-specific ontology are explored and the approach of mapping domain-independent tuples to an ontology using domains from DARPA’s Machine Reading Project is demonstrated.
Open Information Extraction: The Second Generation
TLDR
The second generation of Open IE systems are described, which rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE.
Distant supervision for relation extraction without labeled data
TLDR
This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Neural Relation Extraction with Selective Attention over Instances
TLDR
A sentence-level attention-based model for relation extraction that employs convolutional neural networks to embed the semantics of sentences and dynamically reduces the weights of noisy instances.