HighLife: Higher-arity Fact Harvesting

@article{Ernst2018HighLifeHF,
  title={HighLife: Higher-arity Fact Harvesting},
  author={P. Ernst and A. Siu and Gerhard Weikum},
  journal={Proceedings of the 2018 World Wide Web Conference},
  year={2018}
}
  • P. Ernst, A. Siu, G. Weikum
  • Published 10 April 2018
  • Computer Science
  • Proceedings of the 2018 World Wide Web Conference
Text-based knowledge extraction methods for populating knowledge bases have focused on binary facts: relationships between two entities. However, in advanced domains such as health, it is often crucial to consider ternary and higher-arity relations. An example is to capture which drug is used for which disease at which dosage (e.g., 2.5 mg/day) for which kinds of patients (e.g., children vs. adults). In this work, we present an approach to harvest higher-arity facts from textual sources. Our… 

Extracting Medical Information Using Machine Reading
  • Computer Science
  • 2018
TLDR
This paper presents techniques for extracting medical facts (triples) from unstructured sources following the Machine Reading paradigm and shows how it dealt with several deficiencies of SRL-based information extraction (IE), like entity linking with large arguments, copula verbs that are treated as first-class relations, inability to identify relations expressed through nouns, and the lack of scoring of extracted triples.
StuffIE: Semantic Tagging of Unlabeled Facets Using Fine-Grained Information Extraction
TLDR
This paper uses Stanford dependency parsing, enhanced by lexical databases such as WordNet, to extract nested triple relations, and exploits the syntactic dependencies to semantically tag facets using distant learning based on the Oxford dictionary.
The Need to Move beyond Triples
TLDR
This vision paper argues that it is time to broaden this view: first to relations of higher arity, complex objects, and events, and then also to knowledge about knowledge: the authors should be able to represent why something is true, that something is not true, and that something happened before something else, or that something is mainly believed.
Context-Compatible Information Fusion for Scientific Knowledge Graphs
TLDR
The consequences of uncontrolled knowledge graph evolution in real-world scientific libraries using NLM’s PubMed corpus vs. the SemMedDB knowledge base are quantified and the implicit notion of context compatibility is superior to existing methods in terms of both simplicity and retrieval quality.
Pattern Type Example Derived Clauses Basic Patterns
TLDR
This work proposes a framework that first resolves the long and complicated sentence structures and then uses texture meta-patterns to extract the n-ary tuples with typed entities and their relationships, and achieves the highest precision in comparison with the state-of-the-art baselines.
Nested Relation Extraction with Iterative Neural Network
TLDR
This paper formally formulates the nested relation extraction problem, comes up with a solution using an Iterative Neural Network, and proposes a model that simultaneously considers the word sequence of natural language in the horizontal direction and the DAG structure in the vertical direction.
Pattern Discovery for Wide-Window Open Information Extraction in Biomedical Literature
TLDR
This paper proposes a novel pattern-based information extraction method for the wide-window entities (WW-PIE), which utilizes dependency parsing to break down the long sentences first and then utilizes frequent textual patterns to extract the high-quality information.
Cross-Sentence N-ary Relation Extraction using Lower-Arity Universal Schemas
TLDR
This paper proposes a novel approach to cross-sentence n-ary relation extraction based on universal schemas to learn relation representations of lower-arity facts that result from decomposing higher-arity facts.
Auto-completion for Data Cells in Relational Tables
TLDR
The CellAutoComplete framework is presented, which makes use of a large table corpus and a knowledge base as data sources, and consists of preprocessing, candidate value finding, and value ranking components.
Relation Extraction Using Distant Supervision: a Survey
TLDR
This work presents a survey of relation extraction methods that leverage pre-existing structured or semi-structured data to guide the extraction process, introducing a taxonomy of existing methods and describing distant supervision approaches in detail.

References

KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences
TLDR
This work addresses three major limitations of biomedical KB construction by using a versatile and scalable approach to automatic KB construction, and is the first method that uses consistency checking for biomedical relations.
Coupled temporal scoping of relational facts
TLDR
It is shown that joint inference is effective compared to doing temporal scoping of individual facts independently, on large scale open-domain publicly available time-stamped datasets, such as English Gigaword Corpus and Google Books Ngrams, demonstrating CoTS's effectiveness.
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
TLDR
A general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction is explored, demonstrating its effectiveness with both conventional supervised learning and distant supervision.
Extracting Contextualized Complex Biological Events with Rich Graph-Based Feature Sets
TLDR
A system for extracting complex events among genes and proteins from biomedical literature, developed in the context of the BioNLP’09 Shared Task on Event Extraction, achieves the best performance on all three subtasks.
ClausIE: clause-based open information extraction
TLDR
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
Harvesting facts from textual web sources by constrained label propagation
TLDR
A system called PRAVDA is proposed based on a new kind of label propagation algorithm with a judiciously designed loss function, which iteratively processes the graph to label good temporal facts for a given set of target relations.
Extracting semantically enriched events from biomedical literature
TLDR
This work has constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature, and can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions.
Scalable knowledge harvesting with high precision and high recall
TLDR
A new notion of ngram-itemsets for richer patterns is proposed, and MaxSat-based constraint reasoning is used on both the quality of patterns and the validity of fact candidates, to use in a scalable system for high-quality knowledge harvesting.
Extraction of temporal facts and events from Wikipedia
TLDR
A complete information extraction framework that harvests temporal facts and events from semi-structured data and free text of Wikipedia articles to create a temporal ontology and demonstrates the effectiveness of the proposed methods through several experiments.
Xart system: discovering and extracting correlated arguments of n-ary relations from text
TLDR
The Xart system is presented, which is based on a hybrid method using data mining approaches and syntactic analysis to automatically discover and extract relevant information modeled as n-ary relations from text, in order to populate a domain Ontological and Terminological Resource with new instances.