A method of speaker adaptation for continuous density hidden Markov models (HMMs) is presented. An initial speaker-independent system is adapted to improve the modelling of a new speaker by updating the HMM parameters. Statistics are gathered from the available adaptation data and used to calculate a linear regression-based transformation for the mean...
In this paper we introduce the Minimum Phone Error (MPE) and Minimum Word Error (MWE) criteria for the discriminative training of HMM systems. The MPE/MWE criteria are smoothed approximations to the phone or word error rate respectively. We also discuss I-smoothing, which is a novel technique for smoothing discriminative training criteria using statistics...
One of the key issues for adaptation algorithms is to modify a large number of parameters with only a small amount of adaptation data. Speaker adaptation techniques try to obtain near speaker-dependent (SD) performance with only small amounts of speaker-specific data, and are often based on initial speaker-independent (SI) recognition systems. Some of these...
The key problem to be faced when building an HMM-based continuous speech recogniser is maintaining the balance between model complexity and available training data. For large vocabulary systems requiring cross-word context-dependent modelling, this is particularly acute since many such contexts will never occur in the training data. This paper describes a...
The maximum likelihood linear regression (MLLR) approach for speaker adaptation of continuous density mixture Gaussian HMMs is presented, and its application to static and incremental adaptation for both supervised and unsupervised modes is described. The approach involves computing a transformation for the mixture component means using linear regression. To...
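As a rough illustration of the mean transformation mentioned in the MLLR abstract above, the following sketch estimates a single global affine transform W and applies it to the component means as W [1, mu]. It is not the paper's implementation: it assumes all mixture components share one covariance (so the maximum likelihood estimate reduces to weighted least squares), it uses one global regression class, and the function names and the `gamma` occupation-probability matrix are hypothetical inputs that a real system would obtain from forward-backward alignment.

```python
import numpy as np

def extended_means(means):
    """Prepend a constant 1 to each mean so the offset is part of W."""
    return np.hstack([np.ones((means.shape[0], 1)), means])  # (M, d+1)

def estimate_global_transform(means, obs, gamma):
    """Estimate a d x (d+1) transform W under a shared-covariance assumption.

    means: (M, d) speaker-independent component means
    obs:   (T, d) adaptation observations
    gamma: (T, M) occupation probabilities (assumed given)
    """
    xi = extended_means(means)
    occ = gamma.sum(axis=0)              # total occupancy per component
    Z = (gamma.T @ obs).T @ xi           # sum over t, m of gamma * o_t * xi_m^T
    G = (xi * occ[:, None]).T @ xi       # sum over m of occ_m * xi_m * xi_m^T
    # W = Z G^{-1}; G is symmetric, so solve the transposed system instead.
    return np.linalg.solve(G.T, Z.T).T

def adapt_means(means, W):
    """Apply the transform: each adapted mean is W @ [1, mu]."""
    return extended_means(means) @ W.T
```

Because one transform is shared across components, even components with no adaptation data receive updated means, which is how a large number of parameters can be moved with little data. The sketch needs at least d+1 occupied components with affinely independent means for G to be invertible.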
This paper describes a framework for optimising the structure and parameters of a continuous density HMM-based large vocabulary recognition system using the Maximum Mutual Information Estimation (MMIE) criterion. To reduce the computational complexity of the MMIE training algorithm, confusable segments of speech are identified and stored as word lattices...