HighLife: Higher-arity Fact Harvesting
@article{Ernst2018HighLifeHF, title={HighLife: Higher-arity Fact Harvesting}, author={P. Ernst and A. Siu and Gerhard Weikum}, journal={Proceedings of the 2018 World Wide Web Conference}, year={2018} }
Text-based knowledge extraction methods for populating knowledge bases have focused on binary facts: relationships between two entities. However, in advanced domains such as health, it is often crucial to consider ternary and higher-arity relations. An example is to capture which drug is used for which disease at which dosage (e.g., 2.5 mg/day) for which kinds of patients (e.g., children vs. adults). In this work, we present an approach to harvest higher-arity facts from textual sources. Our…
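To illustrate why binary facts alone can fall short for the drug/disease/dosage/patient example above, here is a minimal sketch. It is not the HighLife method; the `TreatmentFact` class, the helper functions, the names `DrugX` and `DiseaseY`, and the `5 mg/day` value are all hypothetical and chosen only for illustration. It splits two 4-ary facts into binary triples and shows that naively rejoining them produces dosage/patient combinations that were never stated.

```python
from dataclasses import dataclass, asdict
from itertools import product


@dataclass(frozen=True)
class TreatmentFact:
    """Hypothetical 4-ary fact: drug treats disease at dosage for patient group."""
    drug: str
    disease: str
    dosage: str
    patient_group: str


def to_binary(fact):
    """Split one 4-ary fact into binary (drug, attribute, value) triples."""
    return {(fact.drug, k, v) for k, v in asdict(fact).items() if k != "drug"}


def naive_rejoin(triples):
    """Recombine triples per drug via a cross product of attribute values.

    This is the best one can do without extra linking information,
    and it may yield facts that were never in the input.
    """
    by_drug = {}
    for drug, attr, value in triples:
        by_drug.setdefault(drug, {}).setdefault(attr, set()).add(value)
    rebuilt = set()
    for drug, attrs in by_drug.items():
        for disease, dosage, group in product(
            attrs["disease"], attrs["dosage"], attrs["patient_group"]
        ):
            rebuilt.add(TreatmentFact(drug, disease, dosage, group))
    return rebuilt


# Placeholder names and values (assumptions, not real medical data).
facts = {
    TreatmentFact("DrugX", "DiseaseY", "2.5 mg/day", "children"),
    TreatmentFact("DrugX", "DiseaseY", "5 mg/day", "adults"),
}

triples = set().union(*(to_binary(f) for f in facts))
rebuilt = naive_rejoin(triples)
spurious = rebuilt - facts  # combinations that the binary view cannot rule out

for fact in sorted(spurious, key=lambda f: f.dosage):
    print("spurious:", fact)
```

In this sketch the binary view keeps every individual value but loses which dosage belongs to which patient group, which is the kind of information a higher-arity fact retains.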
27 Citations
Extracting Medical Information Using Machine Reading
- Computer Science
- 2018
This paper presents techniques for extracting medical facts (triples) from unstructured sources following the Machine Reading paradigm and shows how it dealt with several deficiencies of SRL-based information extraction (IE), like entity linking with large arguments, copula verbs that are treated as first-class relations, inability to identify relations expressed through nouns, and the lack of scoring of extracted triples.
StuffIE: Semantic Tagging of Unlabeled Facets Using Fine-Grained Information Extraction
- Computer Science
- CIKM
- 2018
This paper uses Stanford dependency parsing, enhanced by lexical databases such as WordNet, to extract nested triple relations, and exploits syntactic dependencies to semantically tag facets using distant learning based on the Oxford dictionary.
The Need to Move beyond Triples
- Computer Science
- Text2Story@ECIR
- 2020
This vision paper argues that it is time to broaden this view: first to relations of higher arity, complex objects, and events, and then also to knowledge about knowledge: the authors should be able to represent why something is true, that something is not true, that something happened before something else, or that something is mainly believed.
Context-Compatible Information Fusion for Scientific Knowledge Graphs
- Computer Science
- TPDL
- 2020
The consequences of uncontrolled knowledge graph evolution in real-world scientific libraries are quantified using NLM’s PubMed corpus vs. the SemMedDB knowledge base, and the implicit notion of context compatibility is shown to be superior to existing methods in terms of both simplicity and retrieval quality.
Pattern Type Example Derived Clauses Basic Patterns
- Computer Science
- 2018
This work proposes a framework that first resolves the long and complicated sentence structures and then uses texture meta-patterns to extract the n-ary tuples with typed entities and their relationships, and achieves the highest precision in comparison with the state-of-the-art baselines.
Nested Relation Extraction with Iterative Neural Network
- Computer Science
- CIKM
- 2019
This paper formally formulates the nested relation extraction problem, comes up with a solution using an Iterative Neural Network, and proposes that the model simultaneously consider the word sequence of natural language in the horizontal direction and the DAG structure in the vertical direction.
Pattern Discovery for Wide-Window Open Information Extraction in Biomedical Literature
- Computer Science
- 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
- 2018
This paper proposes a novel pattern-based information extraction method for the wide-window entities (WW-PIE), which utilizes dependency parsing to break down the long sentences first and then utilizes frequent textual patterns to extract the high-quality information.
Cross-Sentence N-ary Relation Extraction using Lower-Arity Universal Schemas
- Computer Science
- EMNLP
- 2019
This paper proposes a novel approach to cross-sentence n-ary relation extraction based on universal schemas to learn relation representations of lower-arity facts that result from decomposing higher-arity facts.
Auto-completion for Data Cells in Relational Tables
- Computer Science
- CIKM
- 2019
The CellAutoComplete framework is presented, which makes use of a large table corpus and a knowledge base as data sources, and consists of preprocessing, candidate value finding, and value ranking components.
Relation Extraction Using Distant Supervision: a Survey
- Computer Science
- 2019
This work presents a survey of relation extraction methods that leverage pre-existing structured or semi-structured data to guide the extraction process, introducing a taxonomy of existing methods and describing distant supervision approaches in detail.
References
KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences
- Computer Science
- BMC Bioinformatics
- 2015
This work addresses three major limitations of biomedical KB construction by using a versatile and scalable approach to automatic KB construction, and is the first method that uses consistency checking for biomedical relations.
Coupled temporal scoping of relational facts
- Computer Science
- WSDM '12
- 2012
It is shown that joint inference is effective compared to doing temporal scoping of individual facts independently, on large scale open-domain publicly available time-stamped datasets, such as English Gigaword Corpus and Google Books Ngrams, demonstrating CoTS's effectiveness.
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
- Computer Science
- TACL
- 2017
A general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction is explored, demonstrating its effectiveness with both conventional supervised learning and distant supervision.
Extracting Contextualized Complex Biological Events with Rich Graph‐Based Feature Sets
- Computer Science
- Comput. Intell.
- 2011
A system for extracting complex events among genes and proteins from biomedical literature, developed in the context of the BioNLP’09 Shared Task on Event Extraction, is presented; it achieves the best performance on all three subtasks.
ClausIE: clause-based open information extraction
- Computer Science
- WWW '13
- 2013
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
Harvesting facts from textual web sources by constrained label propagation
- Computer Science
- CIKM '11
- 2011
A system called PRAVDA is proposed based on a new kind of label propagation algorithm with a judiciously designed loss function, which iteratively processes the graph to label good temporal facts for a given set of target relations.
Extracting semantically enriched events from biomedical literature
- Computer Science
- BMC Bioinformatics
- 2011
This work has constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature, and can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions.
Scalable knowledge harvesting with high precision and high recall
- Computer Science
- WSDM '11
- 2011
A new notion of ngram-itemsets for richer patterns is proposed, and MaxSat-based constraint reasoning is used on both the quality of patterns and the validity of fact candidates, to use in a scalable system for high-quality knowledge harvesting.
Extraction of temporal facts and events from Wikipedia
- Computer Science
- TempWeb '12
- 2012
A complete information extraction framework is presented that harvests temporal facts and events from semi-structured data and free text of Wikipedia articles to create a temporal ontology; the effectiveness of the proposed methods is demonstrated through several experiments.
Xart system: discovering and extracting correlated arguments of n-ary relations from text
- Computer Science
- WIMS
- 2016
The Xart system is presented, based on a hybrid method that uses data mining approaches and syntactic analysis to automatically discover and extract relevant information, modeled as n-ary relations, from text in order to populate a domain Ontological and Terminological Resource with new instances.