OntoNotes: The 90% Solution
TLDR
The OntoNotes methodology and its result are described: a large, multilingual, richly annotated corpus constructed at 90% interannotator agreement, which will be made available to the community during 2007.
The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation
The objective of the ACE program is to develop technology to automatically infer from human language data the entities being mentioned, the relations among these entities that are directly expressed,
A Study of Translation Error Rate with Targeted Human Annotation
We define a new, intuitive measure for evaluating machine translation output that avoids the knowledge intensiveness of more meaning-based approaches, and the labor-intensiveness of human judgments.
Performance Measures for Information Extraction
TLDR
The slot error rate is defined: an error measure that combines the different types of error directly, without resorting to precision and recall as preliminary measures.
An Algorithm that Learns What's in a Name
TLDR
IdentiFinder™, a hidden Markov model that learns to recognize and classify names, dates, times, and numerical quantities, is evaluated and is competitive with approaches based on handcrafted rules on mixed-case text and superior on text where case information is not available.
Nymble: a High-Performance Learning Name-finder
This paper presents a statistical, learned approach to finding names and other nonrecursive entities in text (as per the MUC-6 definition of the NE task), using a variant of the standard hidden
CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes
TLDR
The CoNLL-2011 shared task involved predicting coreference using OntoNotes data, a new resource that provides multiple integrated annotation layers (parses, semantic roles, word senses, named entities and coreference) that could support joint models.
A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model
TLDR
A novel string-to-dependency algorithm for statistical machine translation employs a target dependency language model during decoding to exploit long-distance word relations, which are unavailable with a traditional n-gram language model.
Plan-And-Write: Towards Better Automatic Storytelling
TLDR
Experiments show that with explicit storyline planning, the generated stories are more diverse, coherent, and on topic than those generated without creating a full plan, according to both automatic and human evaluations.