George R. Doddington

Topic Detection and Tracking (TDT) is a DARPA-sponsored initiative to investigate the state of the art in finding and following new events in a stream of broadcast news stories. The TDT problem consists of three major tasks: (1) segmenting a stream of data, especially recognized speech, into distinct stories; (2) identifying those news stories that are the...
We introduce the DET Curve as a means of representing performance on detection tasks that involve a tradeoff of error types. We discuss why we prefer it to the traditional ROC Curve and offer several examples of its use in speaker recognition and language recognition. We explain why it is likely to produce approximately linear curves. We also note special...
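As an illustration of the idea in this abstract, the following is a minimal sketch of how the points of a DET curve might be computed: miss and false-alarm probabilities are measured at each threshold and then mapped onto a normal deviate scale, which is what tends to make the curve look close to linear when the underlying score distributions are roughly Gaussian. The function name `det_points`, the `eps` clipping parameter, and the "accept when score >= threshold" convention are assumptions made for this example and are not taken from the paper.

```python
from statistics import NormalDist


def det_points(target_scores, nontarget_scores, eps=1e-6):
    """Return (false_alarm_deviate, miss_deviate) pairs, one per threshold.

    Hypothetical helper: a trial is accepted when its score >= threshold.
    """
    probit = NormalDist().inv_cdf  # inverse of the standard normal CDF
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for t in thresholds:
        # Miss: a target trial that falls below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # Clip away from 0 and 1, where the normal deviate is infinite.
        p_miss = min(max(p_miss, eps), 1 - eps)
        p_fa = min(max(p_fa, eps), 1 - eps)
        points.append((probit(p_fa), probit(p_miss)))
    return points
```

Plotting the returned pairs with the false-alarm deviate on the horizontal axis and the miss deviate on the vertical axis gives the DET view; plotting the raw probabilities instead would give the complement of the usual ROC view of the same data.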
The objective of the ACE program is to develop technology to automatically infer from human language data the entities being mentioned, the relations among these entities that are directly expressed, and the events in which these entities participate. Data sources include audio and image data in addition to pure text, and Arabic and Chinese in addition to...
Performance variability in speech and speaker recognition systems can be attributed to many factors. One major factor, which is often acknowledged but seldom analyzed, is inherent differences in the recognizability of different speakers. In speaker recognition systems such differences are characterized by the use of animal names for different types of...
This paper, based on three presentations made in 1998 at the RLA2C Workshop in Avignon, discusses the evaluation of speaker recognition systems from several perspectives. A general discussion of the speaker recognition task and the challenges and issues involved in its evaluation is offered. The NIST evaluations in this area and specifically the 1998...
Speech research has made tremendous progress in the past using the following paradigm: define the research problem, collect a corpus to objectively measure progress, and solve the research problem. Natural language research, on the other hand, has typically progressed without the benefit of any corpus of data with which to test research hypotheses. We...
“Familiar” speaker information is explored using non-acoustic features in NIST’s new “extended data” speaker detection task.[1] Word unigrams and bigrams, used in a traditional target/background likelihood ratio framework, are shown to give surprisingly good performance. Performance continues to improve with additional training and/or test data. Bigram...
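The target/background likelihood ratio framework mentioned in this abstract can be sketched roughly as follows for word bigrams. This is only an illustrative sketch: the function names, the add-one smoothing, the whitespace tokenization in the usage note, and the averaging over test bigrams are all assumptions for the example, and the abstract does not say how the paper's models were estimated or normalized.

```python
import math
from collections import Counter


def bigram_counts(tokens):
    """Count adjacent word pairs in a token list."""
    return Counter(zip(tokens, tokens[1:]))


def log_likelihood_ratio(test_tokens, target_tokens, background_tokens):
    """Average log ratio of target to background bigram probability.

    Hypothetical scoring function; add-one smoothing is an assumption.
    """
    target = bigram_counts(target_tokens)
    background = bigram_counts(background_tokens)
    test_bigrams = list(zip(test_tokens, test_tokens[1:]))
    if not test_bigrams:
        return 0.0
    # Smoothing vocabulary: every bigram type seen in any of the three texts.
    vocab_size = len(set(target) | set(background) | set(test_bigrams))
    n_target = sum(target.values())
    n_background = sum(background.values())
    total = 0.0
    for bigram in test_bigrams:
        p_target = (target[bigram] + 1) / (n_target + vocab_size)
        p_background = (background[bigram] + 1) / (n_background + vocab_size)
        total += math.log(p_target / p_background)
    return total / len(test_bigrams)
```

For example, `log_likelihood_ratio("you know".split(), "you know i mean you know".split(), "the cat sat on the mat".split())` is positive because the test bigram is frequent in the target text and absent from the background text; a score above a chosen threshold would be treated as a detection of the target speaker.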
Most large vocabulary continuous speech recognition (LVCSR) systems in the past decade have used a context-dependent phone as the fundamental acoustic unit. In this paper, we present one of the first robust LVCSR systems that uses a syllable-level acoustic unit for LVCSR on telephone-bandwidth speech. This effort is motivated by the inherent limitations in...