A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts

@inproceedings{Thelen2002ABM,
  title={A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts},
  author={Michael Thelen and Ellen Riloff},
  booktitle={EMNLP},
  year={2002}
}
This paper describes a bootstrapping algorithm called Basilisk that learns high-quality semantic lexicons for multiple categories. Basilisk begins with an unannotated corpus and seed words for each semantic category, which are then bootstrapped to learn new words for each category. Basilisk hypothesizes the semantic class of a word based on collective information over a large body of extraction pattern contexts. We evaluate Basilisk on six semantic categories. The semantic lexicons produced by…
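The bootstrapping loop described in the abstract can be illustrated with a minimal single-category sketch. Everything here is an assumption made for illustration and is not taken from this page: the function name `bootstrap_lexicon`, its parameters, the toy extraction data, and the two scoring formulas (an RlogF-style pattern score and an AvgLog-style word score, as recalled from the paper). The multi-category handling mentioned in the abstract is not modeled.

```python
import math


def bootstrap_lexicon(extractions, seeds, iterations=2, top_patterns=2, words_per_iter=1):
    """Grow a lexicon from seed words (hypothetical, simplified sketch).

    extractions maps each extraction pattern to the set of words it extracts.
    """
    lexicon = set(seeds)
    for _ in range(iterations):
        # Score each pattern: (F / N) * log2(F), where F counts lexicon
        # members the pattern extracts and N counts all its extractions.
        scored = []
        for pattern, words in extractions.items():
            f = len(words & lexicon)
            if f == 0:
                continue  # pattern extracts no known category member
            scored.append(((f / len(words)) * math.log2(f), pattern))
        scored.sort(key=lambda item: (-item[0], item[1]))
        pool = [pattern for _, pattern in scored[:top_patterns]]

        # Candidates: words extracted by pooled patterns, not yet in the lexicon.
        candidates = set()
        for pattern in pool:
            candidates |= extractions[pattern]
        candidates -= lexicon

        def avg_log(word):
            # Average of log2(F_j + 1) over all patterns that extract the word.
            pats = [p for p, ws in extractions.items() if word in ws]
            total = sum(math.log2(len(extractions[p] & lexicon) + 1) for p in pats)
            return total / len(pats)

        # Add the best-scoring candidates; ties are broken alphabetically.
        ranked = sorted(candidates, key=lambda w: (-avg_log(w), w))
        lexicon.update(ranked[:words_per_iter])
    return lexicon


# Toy data (invented): patterns and the nouns they extract from a corpus.
toy_extractions = {
    "lives in <x>": {"london", "paris", "berlin", "poverty"},
    "traveled to <x>": {"paris", "berlin", "madrid"},
    "afraid of <x>": {"poverty", "spiders"},
}
learned = bootstrap_lexicon(toy_extractions, seeds={"london", "paris"})
print(sorted(learned))
```

On this toy data, "poverty" is held back because one of the patterns that extracts it never co-occurs with known category members. This is the sense in which a word's class is hypothesized from collective evidence over all of its pattern contexts rather than from a single pattern.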
Bootstrapping a Semantic Lexicon on Verb Similarities
TLDR
This work presents a bootstrapping algorithm that creates a semantic lexicon from a list of seed words and a corpus mined from the web, and finds that highly domain-related verbs achieve the highest accuracy.
Mutual Screening Graph Algorithm: A New Bootstrapping Algorithm for Lexical Acquisition
TLDR
A new bootstrapping algorithm called Mutual Screening Graph Algorithm (MSGA) learns new words for each semantic category from only an unannotated corpus and a few seed words, changing the format of extracted patterns and the method for scoring patterns and words.
Building a Semantic Lexicon of English Nouns via Bootstrapping
TLDR
The use of a weakly supervised bootstrapping algorithm in discovering contrasting semantic categories from a source lexicon with little training data is described, showing that such automatically categorized terms tend to agree with human judgements.
Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping
TLDR
NEG-FINDER is presented, the first approach for discovering negative categories automatically, and effectively removes the necessity of manual intervention and formulation of negative categories, with performance closely approaching that obtained using negative categories defined by a domain expert.
AutoEncoder Guided Bootstrapping of Semantic Lexicon
TLDR
This work improves Basilisk by modifying its two scoring functions, incorporating an AutoEncoder into the scoring of patterns and candidates to reduce bias and obtain more balanced results.
Learning Semantic Lexicons Using Graph Mutual Reinforcement Based Bootstrapping
TLDR
Experimental results show that the GMR-based bootstrapping approach outperforms the existing algorithms on both in-domain and out-of-domain data, and that the result depends not only on the size of the corpus but also on its quality.
Ensemble-based Semantic Lexicon Induction for Semantic Tagging
TLDR
An ensemble-based framework for semantic lexicon induction is presented that incorporates three diverse approaches for semantic class identification and outperforms individual methods in terms of both lexicon quality and instance-based semantic tagging.
Corpus-based Semantic Lexicon Induction with Web-based Corroboration
TLDR
This research uses a weakly supervised bootstrapping algorithm to induce a semantic lexicon from a text corpus, and then issues Web queries to generate co-occurrence statistics between each lexicon entry and semantically related terms.
An unsupervised method for lexical acquisition based on Bootstrapping
  • Yuhan Zhang, Yanquan Zhou
  • Computer Science
  • 2009 International Conference on Natural Language Processing and Knowledge Engineering
  • 2009
TLDR
This paper presents an unsupervised method called Mutual Screening Graph Algorithm based on Bootstrapping (MSGA-Bootstrapping) for lexical acquisition, and shows that MSGA can outperform the earlier bootstrapping algorithms Basilisk and GMR (Graph Mutual Reinforcement based Bootstrapping).
Combining Contexts in Lexicon Learning for Semantic Parsing
TLDR
A method for the automatic construction of noun entries in a semantic lexicon by modifying adjective, verb-deep-subject and verb-deep-object yields very high precision for most semantic features, giving rise to the fully automatic incorporation into the lexicon.

References

A Corpus-Based Approach for Building Semantic Lexicons
TLDR
This paper presents a corpus-based method that can be used to build semantic lexicons for specific categories using a small set of seed words for a category and a representative text corpus.
Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping
TLDR
A multilevel bootstrapping algorithm is presented that generates both the semantic lexicon and extraction patterns simultaneously and produces high-quality dictionaries for several semantic categories.
Noun-Phrase Co-Occurrence Statistics for Semi-Automatic Semantic Lexicon Construction
TLDR
This paper presents an algorithm for extracting potential entries for a category from an on-line corpus, based upon a small set of exemplars, that could be viewed as an "enhancer" of existing broad-coverage resources.
An Empirical Approach to Conceptual Case Frame Acquisition
TLDR
A corpus-based algorithm for acquiring conceptual case frames empirically from unannotated text that learns semantic preferences for each extraction pattern and merges the syntactically compatible patterns to produce multi-slot case frames with selectional restrictions.
Automatic Acquisition of Hyponyms from Large Text Corpora
TLDR
A set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest are identified.
CRYSTAL: Inducing a Conceptual Dictionary
TLDR
CRYSTAL is described, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus that can often surpass human intuitions in creating reliable extraction rules.
Automatic construction of a hypernym-labeled noun hierarchy from text
TLDR
This work goes a step further by automatically creating not just clusters of related words, but a hierarchy of nouns and their hypernyms, akin to the hand-built hierarchy in WordNet.
Automatically Generating Extraction Patterns from Untagged Text
  • E. Riloff
  • Computer Science
  • AAAI/IAAI, Vol. 2
  • 1996
TLDR
This work has developed a system called AutoSlog-TS that creates dictionaries of extraction patterns using only untagged text, and in experiments with the MUC-4 terrorism domain, created a dictionary of extraction patterns that performed comparably to a dictionary created by AutoSlog, using only preclassified texts as input.
A method for disambiguating word senses in a large corpus
TLDR
The proposed method was designed to disambiguate senses that are usually associated with different topics using a Bayesian argument that has been applied successfully in related tasks such as author identification and information retrieval.