Extracting meronyms for a biology knowledge base using distant supervision

  title={Extracting meronyms for a biology knowledge base using distant supervision},
  author={Xiao Ling and Peter Clark and Daniel S. Weld},
  booktitle={AKBC '13},
Knowledge of objects and their parts, that is, meronym relations, is at the heart of many question-answering systems, but manually encoding these facts is impractical. [...] We introduce a novel algorithm, generalizing the "at least one" assumption of multi-instance learning to handle the case where a fixed (but unknown) percentage of bag members are positive examples. Detailed experiments compare strategies for mention detection, negative example generation, leveraging out-of-domain meronyms, and evaluate…
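
The idea of moving from "at least one" to a percentage of positive bag members can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given in this listing. It only shows one simple way to label a bag when a fraction of its mentions is assumed positive. The function name `label_bag`, the `positive_fraction` parameter, and the use of precomputed mention scores are all hypothetical; in the paper the percentage is unknown and would have to be estimated, whereas here it is passed in directly.

```python
import math
from typing import List


def label_bag(scores: List[float], positive_fraction: float) -> List[bool]:
    """Mark the top-scoring share of a bag's mentions as positive.

    Hypothetical illustration: `scores` are per-mention classifier scores
    for one bag (all sentences mentioning the same entity pair), and
    `positive_fraction` is an assumed share of truly positive mentions.
    """
    if not 0.0 <= positive_fraction <= 1.0:
        raise ValueError("positive_fraction must lie in [0, 1]")
    if not scores:
        return []
    # The classic "at least one" assumption is the lower bound:
    # even a tiny fraction still yields one positive mention.
    k = max(1, math.ceil(positive_fraction * len(scores)))
    # Rank mention indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(scores))]


if __name__ == "__main__":
    # A bag of four mentions, assuming half of them are positive.
    print(label_bag([0.9, 0.1, 0.4, 0.8], 0.5))
```

Setting `positive_fraction` to 0 reduces the sketch to the "at least one" case, since exactly one mention (the highest-scoring) is then marked positive.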

Type-Aware Distantly Supervised Relation Extraction with Linked Arguments

Four orthogonal improvements are investigated: integrating named entity linking (NEL) and coreference resolution into argument identification for training and extraction, enforcing type constraints of linked arguments, and partitioning the model by relation type signature.

Detecting Biomedical Relations using Distant Supervision

The main goal is to investigate whether UMLS is suitable for labelling data automatically in order to detect similar information in natural language. A method to reduce falsely labelled instances in the automatically generated data is presented and found to improve the detection of relationships.

Disambiguation for Semi-Supervised Extraction of Complex Relations in Large Commonsense Knowledge Bases

Preliminary results show that these methods can extract complex relations from text with good accuracy and show how a large Web-scale corpus could be used with the Cyc knowledge base to aid in disambiguation tasks.

Do Dogs have Whiskers? A New Knowledge Base of hasPart Relations

A new knowledge base of hasPart relationships, extracted from a large corpus of generic statements, is presented, which is all three of: accurate (90% precision), salient, and has high coverage of common terms.

Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags

A new method for automatically acquiring part-whole commonsense from Web contents and image tags at an unprecedented scale, yielding many millions of assertions, while specifically addressing the four shortcomings of prior work.

Acquisition of Turkish meronym based on classification of patterns

This paper provides semi-automatic pattern-based extraction of part–whole relations from a raw text by utilizing and adopting some lexico-syntactic patterns to disclose meronymy relation from a Turkish corpus.

A Case Study in Bootstrapping Ontology Graphs from Textbooks

This paper addresses the question: to what extent can automated extraction and crowd sourcing techniques be combined to bootstrap the creation of comprehensive and accurate ontology graphs by adapting the state-of-the-art language model BERT to this task.

A study of the knowledge base requirements for passing an elementary science test

The analysis suggests that as well as fact extraction from text and statistically driven rule extraction, three other styles of automatic knowledge base construction (AKBC) would be useful: acquiring definitional knowledge, direct 'reading' of rules from texts that state them, and, given a particular representational framework, acquisition of specific instances of those models from text.

Bootstrapping Ontology Graphs

  • Computer Science
  • 2021
This paper addresses the question: to what extent can automated extraction and crowd sourcing techniques be combined to bootstrap the creation of comprehensive and accurate ontology graphs by adapting the state-of-the-art language model BERT to this task.



Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.

Modeling Relations and Their Mentions without Labeled Text

A novel approach to distant supervision that can alleviate the problem of noisy patterns that hurt precision by using a factor graph and applying constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in the authors' training KB.

Multi-instance Multi-label Learning for Relation Extraction

This work proposes a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables that performs competitively on two difficult domains.

Distant supervision for relation extraction without labeled data

This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.

Semantic Taxonomy Induction from Heterogenous Evidence

This work proposes a novel algorithm for inducing semantic taxonomies that flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word's coordinate terms to help in determining its hypernyms, and vice versa.

Learning to Extract Relations from the Web using Minimal Supervision

An existing relation extraction method is extended to handle this weaker form of supervision, and experimental results demonstrate that the approach can reliably extract relations from web documents.

On Learning Subtypes of the Part-Whole Relation: Do Not Mix Your Seeds

It is shown that the traditional practice of initializing minimally-supervised algorithms with a single set that mixes seeds of different types fails to capture the wide variety of part-whole patterns and tuples.

Constructing Biological Knowledge Bases by Extracting Information from Text Sources

A research effort aimed at automatically mapping information from text sources into structured representations, such as knowledge bases, is begun, using machine-learning methods to induce routines for extracting facts from text.

Learning Semantic Constraints for the Automatic Discovery of Part-Whole Relations

This paper presents a method, and its results, for learning semantic constraints to detect part-whole relations; the targeted part-whole relations were detected with an accuracy of 83%.

Automatic Acquisition of Hyponyms from Large Text Corpora

A set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest are identified.