Can We Make Information Extraction More Adaptive?

Yorick Wilks and Roberta Catizone
It seems widely agreed that IE (Information Extraction) is now a tested language technology that has reached precision+recall values that put it in about the same position as Information Retrieval and Machine Translation, both of which are widely used commercially. There is also a clear range of practical applications that would be eased by the sort of template-style data that IE provides. The problem for wider deployment of the technology is adaptability: the ability to customize IE rapidly…
Automatic Multi-Lingual Information Extraction
This thesis aims to implement a self-contained, language-independent automatic IE system that handles named entity recognition and entity relation extraction for Chinese and English, and extends maximum entropy models to incorporate unlabeled data for general information extraction.
A study on question answering system using integrated retrieval method
The starting point is the usability and limitations of Question Answering.
Aspects of Text Mining From Computational Semiotics to Systemic Functional Hypertexts
The specific complexity of this class of signs is outlined and requirements for the implementation of text mining procedures are derived with reference to text linkage introduced as a special task in text mining.
A Study on the Centrality of Relevance for the problem of Automated Question Answering
An overview of literature concerning Question Answering systems is given and the concept of relevance is identified as central to the problem of question answering. The idea of relevance is
Employing Information Extraction for Building Mobile Applications
We describe an SMS-based information system called CATS, which allows posting and searching through free Arabic text using Information Extraction (IE) technology. We discuss the challenges of applying
An Evolutionary Genre-based and Domain-independent Approach for High-level Knowledge Discovery from Texts
The underlying model, which integrates Information Extraction and evolutionary techniques to produce interesting and novel hypothesis rules, is described, and novel criteria to evaluate the roles’ performance in terms of interestingness and novelty are proposed.
Einleitung: Perspektiven und Positionen des Text Mining
With the introduction of new media, digitized documents are growing in importance as the primary medium for processing, distributing, and managing information in public and commercial organizations.


Information Extraction: Beyond Document Retrieval
In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre‐specified set of entities,
Information Extraction: Techniques and Challenges
This volume takes a broad view of information extraction as any method for filtering information from large volumes of text. This includes the retrieval of documents from collections and the tagging of
An empirical approach to Lexical Tuning
A model of "lexical tuning" (the systematic adaptation of a lexicon to a corpus) is described that specializes the set of verb senses required for an NLP application and inductively builds the corresponding lexical descriptions for those senses.
Mixed-Initiative Development of Language Processing Systems
A new set of integrated tools, collectively called the Alembic Workbench, that uses a mixed-initiative approach to "bootstrapping" the manual tagging process, with the goal of reducing the overhead associated with corpus development.
Automatically Constructing a Dictionary for Information Extraction Tasks
Using AutoSlog, a system that automatically builds a domain-specific dictionary of concepts for extracting information from text, a dictionary for the domain of terrorist event descriptions was constructed in only 5 person-hours and the overall scores were virtually indistinguishable.
Automatically Acquiring Conceptual Patterns without an Annotated Corpus
It is proposed that conceptual patterns for information extraction can be acquired automatically using only a preclassified training corpus and no text annotations, and a system called AutoSlog-TS is described, which is a variation of the previous AutoSlog system that runs exhaustively on an untagged text corpus.
Extracting pragmatic content from Email
Results concerning the large scale automatic extraction of pragmatic content from Email, by a system based on a phrase matching approach to Speech Act detection combined with the empirical detection of Speech Act patterns in corpora are presented.
Lexical ambiguity and information retrieval
An analysis of lexical ambiguity in information retrieval test collections and experiments to determine the utility of word meanings for separating relevant from nonrelevant documents show that there is considerable ambiguity even in a specialized database.
Evaluating Natural Language Processing Techniques in Information Retrieval
This chapter reports on the progress of the Natural Language Information Retrieval project, a joint effort of several sites led by GE Research, and its evaluation in a series of Text Retrieval Conferences conducted since 1992 under the auspices of the National Institute of Standards and Technology (NIST) and the Defense Advanced Research Projects Agency (DARPA).
Genus Disambiguation: A Study in Weighted Preference
A series of experiments is reported that weight the three factors in various ways, and improvements to the algorithm that bring it to about 90% accuracy are described.