The DET curve in assessment of detection task performance
TLDR
We introduce the DET Curve as a means of representing performance on detection tasks that involve a tradeoff of error types.
Vocal tract normalization in speech recognition: Compensating for systematic speaker variability
The performance of speech recognition systems is often improved by accounting explicitly for sources of variability in the data. In the SWITCHBOARD corpus, studied during the 1994 CAIP workshop ...
Initial evaluation of hidden dynamic models on conversational speech
TLDR
Hidden dynamic models (HDMs) attempt to automatically learn spectral targets in a hidden feature space using models that integrate linguistic information with constrained temporal trajectory models.
CASS: a phonetically transcribed corpus of Mandarin spontaneous speech
TLDR
A corpus of spoken Chinese has been collected and phonetically annotated to capture spontaneous speech and language effects.
Selective sampling of training data for speech recognition
TLDR
We propose an iterative training selection algorithm to improve speech recognition by automatically selecting a subset of the available human-transcribed training data, thereby improving error rates without incurring additional transcription cost.
Automatic selection of transcribed training material
TLDR
We propose an iterative training algorithm that seeks to improve the error rate of a speech recognizer without incurring additional transcription cost, by selecting a subset of the already available transcribed training data.
Automatic generation of pronunciation lexicons for Mandarin spontaneous speech
TLDR
Pronunciation modeling for large vocabulary speech recognition attempts to improve recognition accuracy by identifying pronunciations that are not in the ASR system's pronunciation lexicon.
Robustness aspects of active learning for acoustic modeling
TLDR
We demonstrate robustness to seven initial conditions, showing that we can select around 20 hours of training data and achieve a range of error rates between 8.6% and 9.3%, compared to using all of the given training data.
Active learning for acoustic speech recognition modeling
In this work, we investigate a machine learning approach to cost-effectively train acoustic models for speech recognition. More specifically, we utilize an active learning method that allows the ...