Domain-Targeted, High Precision Knowledge Extraction

@article{Dalvi2017DomainTargetedHP,
  title={Domain-Targeted, High Precision Knowledge Extraction},
  author={Bhavana Dalvi and Niket Tandon and Peter E. Clark},
  journal={Transactions of the Association for Computational Linguistics},
  year={2017},
  volume={5},
  pages={233-246}
}
Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject,predicate,object) statements about the world, in support of a downstream question-answering (QA) application. [...] Key Method: To address these, we have created a domain-targeted, high precision knowledge extraction pipeline, leveraging Open IE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high precision knowledge targeted to a particular domain - in our case…
Combining Analogy with Language Models for Knowledge Extraction
  • 2021
Learning structured knowledge from natural language text has been a long-standing challenge. Previous work has focused on specific domains, mostly extracting knowledge about named entities (e.g.…
GenericsKB: A Knowledge Base of Generic Statements
We present a new resource for the NLP community, namely a large (3.5M+ sentence) knowledge base of *generic statements*, e.g., "Trees remove carbon dioxide from the atmosphere", collected from…
Knowledge Completion for Generics using Guided Tensor Factorization
TLDR
This work considers the problem of inferring additional such facts about common nouns or generics at a precision similar to that of the starting KB, and presents the first approach that is successful.
On Aligning OpenIE Extractions with Knowledge Bases: A Case Study
TLDR
This paper directly evaluates how OIE triples from the OPIEC corpus are related to the DBpedia KB w.r.t. information content and suggests that a significant part of OIE triples can be expressed by means of KB formulas instead of individual facts.
Weakly Supervised, Data-Driven Acquisition of Rules for Open Information Extraction
TLDR
A way to acquire rules for Open Information Extraction, based on lemma sequence patterns (including potential typographical symbols) linking two named entities in a sentence, is proposed, which does not necessitate expensive resources or time-consuming handcrafted resources, but does require a large amount of text.
A High Precision Pipeline for Financial Knowledge Graph Construction
TLDR
This paper develops a high precision knowledge extraction pipeline tailored for the financial domain that combines multiple information extraction techniques with a financial dictionary that is built, all working together to produce over 342,000 compact extractions from over 288,000 financial news articles.
On the Limits of Machine Knowledge: Completeness, Recall and Negation in Web-scale Knowledge Bases
TLDR
This tutorial discusses how completeness, recall and negation in DBs and KBs can be represented, extracted, and inferred in knowledge bases.
On the Limits of Aligning OpenIE Extractions with Knowledge Bases
  • 2020
Open information extraction is the task of extracting relations and their corresponding arguments from a natural language sentence in an unsupervised manner. Outputs of such systems are used for…
Commonsense Properties from Query Logs and Question Answering Forums
TLDR
Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources that focuses on salient properties that are typically associated with certain objects or concepts, is presented.
ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference
TLDR
A new model for NLI called External Knowledge Enhanced BERT (ExBERT) is introduced, to enrich the contextual representation with real-world commonsense knowledge from external knowledge sources and enhance BERT’s language understanding and reasoning capabilities.

References

SHOWING 1-10 OF 39 REFERENCES
Knowledge vault: a web-scale approach to probabilistic knowledge fusion
TLDR
The Knowledge Vault is a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories that computes calibrated probabilities of fact correctness.
Yago: a core of semantic knowledge
TLDR
YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts, which includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE).
Open Information Extraction from the Web
TLDR
Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Unsupervised Methods for Determining Object and Relation Synonyms on the Web
TLDR
This paper presents a scalable, fully-implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K, and introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them.
Canonicalizing Open Knowledge Bases
TLDR
This paper presents an approach based on machine learning methods that can canonicalize such Open IE triples, by clustering synonymous names and phrases, thus shedding light on the middle ground between "open" and "closed" information extraction systems.
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
RELLY: Inferring Hypernym Relationships Between Relational Phrases
TLDR
This work presents a new general-purpose method, RELLY, for constructing a large hypernymy graph of relational phrases with high-quality subsumptions using collective probabilistic programming techniques, and demonstrates the value of this resource for a document-relevance ranking task.
Reasoning With Neural Tensor Networks for Knowledge Base Completion
TLDR
An expressive neural tensor network suitable for reasoning over relationships between two entities given a subset of the knowledge base is introduced and performance can be improved when entities are represented as an average of their constituting word vectors.
Open Information Extraction to KBP Relations in 3 Hours
We participated in both the English Slot Filling and Entity Linking in the 2013 TAC-KBP evaluation. Our Slot Filling system provides an answer to the following conjectures: Can Open Information…
Relation Extraction with Matrix Factorization and Universal Schemas
TLDR
This work presents matrix factorization models that learn latent feature vectors for entity tuples and relations that achieve substantially higher accuracy than a traditional classification approach and is able to reason about unstructured and structured data in mutually-supporting ways.