Corpus ID: 10779824

Automatically Acquiring Conceptual Patterns without an Annotated Corpus

@inproceedings{Riloff1995AutomaticallyAC,
  title={Automatically Acquiring Conceptual Patterns without an Annotated Corpus},
  author={Ellen Riloff and Jay Shoen},
  booktitle={VLC@ACL},
  year={1995}
}
Previous work on automated dictionary construction for information extraction has relied on annotated text corpora. However, annotating a corpus is time-consuming and difficult. We propose that conceptual patterns for information extraction can be acquired automatically using only a preclassified training corpus and no text annotations. We describe a system called AutoSlog-TS, which is a variation of our previous AutoSlog system, that runs exhaustively on an untagged text corpus. Text…
Automatically Generating Extraction Patterns from Untagged Text
  • E. Riloff
  • Computer Science
  • AAAI/IAAI, Vol. 2
  • 1996
TLDR
This work has developed a system called AutoSlog-TS that creates dictionaries of extraction patterns using only untagged text. In experiments with the MUC-4 terrorism domain, AutoSlog-TS created a dictionary of extraction patterns that performed comparably to a dictionary created by AutoSlog, using only preclassified texts as input.
An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains
  • E. Riloff
  • Computer Science, Medicine
  • Artif. Intell.
  • 1996
TLDR
The performance of AutoSlog across the three domains is compared, the lessons learned about the generality of this approach are discussed, and results from two experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog are presented.
A portable method for acquiring information extraction patterns without annotated corpora
TLDR
ESSENCE is described, a new method for acquiring IE patterns that significantly reduces the need for human intervention and uses a general purpose ontology and widely applied syntactic tools to do so.
Acquiring information extraction patterns from unannotated corpora
TLDR
Essence, a novel method for acquiring IE patterns, is presented. It significantly reduces the need for human intervention and the expert effort required to build an IE system, and therefore also reduces the effort of porting the method to any domain.
Using learned extraction patterns for text classification
  • E. Riloff
  • Computer Science
  • Learning for Natural Language Processing
  • 1995
TLDR
A series of experiments are described that show how the extraction patterns learned by AutoSlog can be used for text classification, and three dictionaries produced by AutoSlog for different domains performed well in these experiments.
Toward General-Purpose Learning for Information Extraction
TLDR
SRV is described, a learning architecture for information extraction which is designed for maximum generality and flexibility and can exploit domain-specific information, including linguistic syntax and lexical information, in the form of features provided to the system explicitly as input for training.
Machine Learning for Information Extraction from Online
The field of information extraction (IE) is concerned with applying natural language processing (NLP) to extract essential details from text documents automatically. Recent results have demonstrated…
ESSENCE: a portable methodology for building information extraction systems
TLDR
This work presents a methodology to automatically learn information extraction patterns from an unrestricted text corpus representative of the domain, and includes the use of lexical knowledge along with the lexico-semantic relations from WordNet for the generalization process.
Pattern Construction for Extracting Domain Terminology
TLDR
This article deals with a methodology for automatically obtaining patterns (Basic Patterns and Definitory Verbal Patterns) for extracting domain terminology while minimizing the manual work of the experts.
Automatic Template Creation for Information Extraction, an Overview
TLDR
The approach will carry out a corpus-based analysis of task-relevant documents, identifying and analysing the interaction between the fundamental elements, and will define the semantic relationships necessary to identify and categorise these fundamental elements.

References

SHOWING 1-10 OF 19 REFERENCES
Automatically Constructing a Dictionary for Information Extraction Tasks
TLDR
Using AutoSlog, a system that automatically builds a domain-specific dictionary of concepts for extracting information from text, a dictionary for the domain of terrorist event descriptions was constructed in only 5 person-hours, and its overall scores were virtually indistinguishable from those of a hand-crafted dictionary.
Information extraction as a basis for portable text classification systems
TLDR
This dissertation addresses the knowledge-engineering bottleneck for a natural language processing task called "information extraction" and presents a system called AutoSlog, which automatically constructs dictionaries for information extraction, given an appropriate training corpus.
Information extraction as a basis for high-precision text classification
TLDR
An approach to text classification that represents a compromise between traditional word-based techniques and in-depth natural language processing is described, along with an automated method for empirically deriving appropriate threshold values.
Acquiring Lexical Knowledge from Text: A Case Study
TLDR
This paper describes an approach to constructing new lexical entries in a gradual process by analyzing a sequence of example texts to permit the graceful tolerance of new words while enabling the automated extension of the lexicon.
Acquisition of semantic patterns for information extraction from corpora
  • Juntae Kim, D. Moldovan
  • Computer Science
  • Proceedings of 9th IEEE Conference on Artificial Intelligence for Applications
  • 1993
TLDR
A knowledge acquisition tool that extracts semantic patterns for a memory-based information retrieval system is presented; it facilitates the construction of a large knowledge base of semantic patterns.
Coping with Ambiguity and Unknown Words through Probabilistic Models
TLDR
A new natural language system (PLUM) is constructed for extracting data from text, e.g., newswire text, based on results of experiments in predicting parts of speech of highly ambiguous words and predicting the intended interpretation of an utterance when more than one interpretation satisfies all known syntactic and semantic constraints.
Building a Large Annotated Corpus of English: The Penn Treebank
TLDR
As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Symbolic/Subsymbolic Sentence Analysis: Exploiting the Best of Two Worlds
TLDR
A general introduction to CIRCUS is provided, the opportunities for different kinds of memory interactions with CIRCUS are presented, and detailed descriptions of the authors' marker passing and numerical relaxation algorithms are provided.
A stochastic parts program and noun phrase parser for unrestricted text
  • Kenneth Ward Church
  • Computer Science
  • International Conference on Acoustics, Speech, and Signal Processing
  • 1989
TLDR
A program that tags each word in an input sentence with the most likely part of speech has been written and performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.
Towards a Self-Extending Parser
TLDR
This paper discusses an approach to incremental learning in natural language processing by projecting and integrating semantic constraints to learn word definitions as implemented in the POLITICS system.