Transformation-based model adaptation techniques such as maximum likelihood linear regression (MLLR) rely on accurately selecting the number of transformations for a given amount of adaptation data. If too many transformations are used, the transformation parameters may be poorly estimated, overfit the adaptation data, and generalize poorly. On...
In the past few years, transformation-based model adaptation techniques have been widely used to help reduce the acoustic mismatch between training and testing conditions of automatic speech recognizers. The transformation parameters are usually estimated using paradigms based on classical statistics such as maximum likelihood,...
We are interested in retrieving information from speech data such as broadcast news, telephone conversations, and roundtable meetings. Today, most systems use large-vocabulary continuous speech recognition tools to produce word transcripts; the transcripts are indexed and query terms are retrieved from the index. However, query terms that are not part of the...
Classical audio retrieval techniques consist of transcribing audio documents with a large-vocabulary speech recognition system and indexing the resulting transcripts. However, queries that are not part of the recognizer's vocabulary, or that have a high probability of being misrecognized, can significantly impair the performance of the retrieval system.
We propose a Minimum Verification Error (MVE) training scenario to design and adapt an HMM-based speaker verification system. Using the discriminative training paradigm, we show that customer and background models can be jointly estimated so that the expected number of verification errors (false accepts and false rejects) on the training corpus is...
An auditory feature extraction algorithm for robust speech recognition in adverse acoustic environments is proposed. Based on an analysis of the human auditory system, the feature extraction algorithm consists of several modules: FFT, outer-middle-ear transfer function, frequency conversion from the linear to the Bark scale, auditory filtering, nonlinearity, and...
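As a rough illustration of the linear-to-Bark frequency conversion step named in this abstract, the sketch below maps FFT power bins onto equal-width Bark bands. It is only a sketch under stated assumptions: the abstract does not say which Bark formula the authors use, so Traunmueller's common approximation is swapped in here, and the function names `hz_to_bark` and `group_bins_by_bark` are hypothetical. The other modules (outer-middle-ear transfer function, auditory filtering, nonlinearity) are not modeled.

```python
def hz_to_bark(freq_hz):
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's approximation; the paper may use another formula.
    """
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def group_bins_by_bark(power_spectrum, sample_rate, n_bands):
    """Sum one-sided FFT power bins into n_bands equal-width Bark bands.

    Assumes power_spectrum holds bins from 0 Hz up to the Nyquist frequency
    inclusive (the usual real-FFT layout) and has at least two bins.
    """
    n_bins = len(power_spectrum)
    nyquist = sample_rate / 2.0
    lowest = hz_to_bark(0.0)
    highest = hz_to_bark(nyquist)
    width = (highest - lowest) / n_bands

    bands = [0.0] * n_bands
    for k, power in enumerate(power_spectrum):
        freq = k * nyquist / (n_bins - 1)  # frequency of bin k in Hz
        # Clamp so the Nyquist bin falls into the last band.
        band = min(int((hz_to_bark(freq) - lowest) / width), n_bands - 1)
        bands[band] += power
    return bands


if __name__ == "__main__":
    flat = [1.0] * 257  # flat spectrum from a hypothetical 512-point FFT
    print(group_bins_by_bark(flat, sample_rate=16000, n_bands=18))
```

With a flat input spectrum, the higher bands collect more bins than the lower ones, reflecting the coarser frequency resolution of the Bark scale at high frequencies.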
In this paper we introduce an approach to transformation-based model adaptation techniques. Previously published schemes such as MLLR define a set of affine transformations to be applied to clusters of model parameters. Although it has been shown that this approach can yield good results when adaptation data is scarce, an inherent problem needs to be...
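To make the MLLR-style scheme mentioned in this abstract concrete, the sketch below applies one shared affine transformation (a matrix `A` and an offset `b`) to every Gaussian mean in a cluster. It illustrates only the previously published idea of cluster-wise affine transforms, not the approach this paper introduces, and it does not estimate `A` and `b` from adaptation data. The names `apply_affine`, `adapt_means`, and the dictionary layout are assumptions made for illustration.

```python
def apply_affine(A, b, mean):
    """Return A @ mean + b for a square matrix A given as a list of rows."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_means(means, cluster_of, transforms):
    """Adapt each mean with the transform of the cluster it belongs to.

    means:      {gaussian_name: mean vector}
    cluster_of: {gaussian_name: cluster id}
    transforms: {cluster id: (A, b)}, assumed to be already estimated
    """
    adapted = {}
    for name, mean in means.items():
        A, b = transforms[cluster_of[name]]
        adapted[name] = apply_affine(A, b, mean)
    return adapted


if __name__ == "__main__":
    means = {"g1": [1.0, 2.0], "g2": [0.0, 1.0]}
    cluster_of = {"g1": 0, "g2": 0}  # both Gaussians share one transform
    transforms = {0: ([[2.0, 0.0], [0.0, 1.0]], [0.5, -1.0])}
    print(adapt_means(means, cluster_of, transforms))
```

Because all Gaussians in a cluster share one transform, a mean can be adapted even if its own Gaussian was never observed in the adaptation data, which is why such schemes can work when data is scarce.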
Building multiple automatic speech recognition (ASR) systems and combining their outputs using voting techniques such as ROVER is an effective way to lower the overall word error rate. A successful system combination approach requires the construction of multiple systems with complementary errors, or the combination will not outperform any of the...