In this paper, we present several confidence measures for large vocabulary continuous speech recognition. We propose to estimate the confidence of a hypothesized word directly as its posterior probability, given all acoustic observations of the utterance. These probabilities are computed on word graphs using a forward-backward algorithm. We also study the …
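As a rough illustration of the kind of computation involved, the sketch below runs a forward-backward pass over a word graph given as a list of scored edges and turns the accumulated path mass into per-edge posterior probabilities. The edge representation, the assumption of topologically ordered integer node labels, and all function names are illustrative choices made for this example, not taken from the paper.

import math
from collections import defaultdict

def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def edge_posteriors(edges, start_node, end_node):
    """Forward-backward pass over a word graph.

    edges: list of (from_node, to_node, word, log_score), where log_score
    combines the (suitably scaled) acoustic and language model scores of
    the edge. Nodes are assumed to be topologically ordered integers.
    Returns (word, posterior) for every edge, given the whole utterance.
    """
    fwd = defaultdict(lambda: float("-inf"))
    bwd = defaultdict(lambda: float("-inf"))
    fwd[start_node] = 0.0
    bwd[end_node] = 0.0

    # Forward pass: process edges in increasing order of their source node.
    for u, v, _, score in sorted(edges, key=lambda e: e[0]):
        if fwd[u] > float("-inf"):
            fwd[v] = logsumexp([fwd[v], fwd[u] + score])

    # Backward pass: process edges in decreasing order of their source node.
    for u, v, _, score in sorted(edges, key=lambda e: -e[0]):
        if bwd[v] > float("-inf"):
            bwd[u] = logsumexp([bwd[u], score + bwd[v]])

    total = fwd[end_node]  # log probability mass of all complete paths
    return [(word, math.exp(fwd[u] + score + bwd[v] - total))
            for u, v, word, score in edges]

In the full method, the posterior of a hypothesized word is typically accumulated further over graph edges carrying the same word in overlapping time spans; the sketch stops at per-edge posteriors.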
Estimates of confidence for the output of a speech recognition system can be used in many practical applications of speech recognition technology. They can be employed for detecting possible errors and can help to avoid undesirable verification turns in automatic inquiry systems. In this paper we propose to estimate the confidence in a hypothesized word as …
In this paper, we introduce a new concept, the time frame error rate. We show that this error rate is closely correlated with the word error rate and use it to overcome the mismatch between Bayes' decision rule, which aims at minimizing the expected sentence error rate, and the word error rate, which is used to assess the performance of speech recognition …
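Although the abstract does not spell out the definition, a time frame error rate can be illustrated as the fraction of frames whose hypothesized word label differs from the reference label. The segment representation and the treatment of uncovered frames as silence in the sketch below are assumptions made for the example, not the paper's exact definition.

def time_frame_error_rate(reference, hypothesis, num_frames):
    """Fraction of time frames labelled with a wrong word.

    reference, hypothesis: lists of (word, start_frame, end_frame) with
    end_frame exclusive; frames not covered by any word count as silence.
    """
    def frame_labels(segments):
        labels = ["<sil>"] * num_frames
        for word, start, end in segments:
            for t in range(start, end):
                labels[t] = word
        return labels

    ref = frame_labels(reference)
    hyp = frame_labels(hypothesis)
    errors = sum(1 for r, h in zip(ref, hyp) if r != h)
    return errors / num_frames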
For large vocabulary continuous speech recognition systems, the amount of acoustic training data is of crucial importance. In the past, large amounts of speech were therefore recorded from various sources and had to be transcribed manually. It is thus desirable to train a recognizer with as little manually transcribed acoustic data as possible. Since …
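The abstract is cut off before the method is described, but a common way to exploit untranscribed speech is to recognize it with a seed system and keep only hypotheses the recognizer is confident about. The sketch below illustrates that idea with an assumed data layout and an arbitrary threshold; it is not the paper's actual selection criterion.

def select_training_data(recognized, threshold=0.9):
    """Filter automatically transcribed utterances by word confidence.

    recognized: list of (utterance_id, [(word, confidence), ...]).
    Keeps an utterance only if every hypothesized word reaches the
    confidence threshold; the surviving transcriptions can then be added
    to the acoustic training corpus.
    """
    selected = []
    for utt_id, words in recognized:
        if words and all(conf >= threshold for _, conf in words):
            selected.append((utt_id, [w for w, _ in words]))
    return selected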
In this paper we present and compare several confidence measures for large vocabulary continuous speech recognition. We show that posterior word probabilities computed on word graphs and N-best lists clearly outperform non-probabilistic confidence measures, e.g. the acoustic stability and the hypothesis density. In addition, we prove that the estimation of …
In this paper, we present an overview of the RWTH Aachen large vocabulary continuous speech recognizer. The recognizer is based on continuous density hidden Markov models and a time-synchronous left-to-right beam search strategy. Experimental results on the ARPA Wall Street Journal (WSJ) corpus verify the effects of several system components, namely …
In this paper we present a new scoring scheme for speech recognition. Instead of using the joint probability of a word sequence and a sequence of acoustic observations, we determine the best path through a word graph using posterior word probabilities. These probabilities are computed beforehand with a modified forward-backward algorithm. It is important to …
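A sketch of the rescoring step this suggests: once every edge of the word graph carries a posterior probability (for instance from a forward-backward pass as sketched earlier), the best word sequence can be found with a simple dynamic program. Summing posteriors along a path is only one possible combination rule and is chosen here for illustration; the paper's exact scoring is not given in this excerpt.

def best_posterior_path(edges, start_node, end_node):
    """Dynamic programming over a word graph: pick the path whose words
    have the highest accumulated posterior probability.

    edges: list of (from_node, to_node, word, posterior); nodes are
    assumed to be topologically ordered integers.
    Returns the word sequence of the best-scoring path.
    """
    best = {start_node: (0.0, [])}
    for u, v, word, post in sorted(edges, key=lambda e: e[0]):
        if u not in best:
            continue
        score, words = best[u]
        candidate = (score + post, words + [word])
        if v not in best or candidate[0] > best[v][0]:
            best[v] = candidate
    return best.get(end_node, (0.0, []))[1]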
The use of dialogue-state dependent language models in automatic inquiry systems can improve speech recognition and understanding if a reasonable prediction of the dialogue state is feasible. In this paper, the dialogue state is defined as the set of parameters which are contained in the system prompt. For each dialogue state a separate language model is …
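A minimal sketch of how such a state-dependent model could be selected at run time, identifying the dialogue state with the set of parameters requested in the current system prompt; the dictionary lookup, the fallback to a state-independent model, and the parameter names in the usage comment are assumptions made for the example.

def pick_language_model(prompt_parameters, state_models, general_model):
    """Choose a dialogue-state dependent language model.

    The dialogue state is taken to be the (frozen) set of parameters asked
    for in the current system prompt; state_models maps such frozensets to
    language models. If no model was trained for the state, fall back to
    the state-independent model.
    """
    state = frozenset(prompt_parameters)
    return state_models.get(state, general_model)

# Illustrative usage for a prompt asking for departure and destination city:
# lm = pick_language_model({"origin", "destination"}, state_models, general_lm)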
Automatic recognition of conversational speech tends to have higher word error rates (WER) than read speech. Improvements gained from unsupervised speaker adaptation methods like Maximum Likelihood Linear Regression (MLLR) [1] are reduced because of their sensitivity to recognition errors in the first pass. We show that a more detailed modeling of …
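For context, MLLR adapts the Gaussian mean vectors of the acoustic model with affine transforms estimated per regression class. The sketch below only shows how such transforms would be applied, under assumed array shapes; estimating the transforms from adaptation data, and whatever refinement the truncated sentence refers to, are not shown.

import numpy as np

def apply_mllr(means, transforms, classes):
    """Apply class-specific MLLR affine transforms to Gaussian means.

    means: (N, d) array of Gaussian mean vectors.
    transforms: dict mapping a regression class label to a pair (A, b),
    with A of shape (d, d) and b of shape (d,).
    classes: length-N sequence giving the regression class of each Gaussian.
    Returns the adapted means A @ mu + b.
    """
    means = np.asarray(means, dtype=float)
    adapted = np.empty_like(means)
    for i, mu in enumerate(means):
        A, b = transforms[classes[i]]
        adapted[i] = A @ mu + b
    return adapted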