Learn More
In this paper a framework for maximum a posteriori (MAP) estimation of hidden Markov models (HMM) is presented. Three key issues of MAP estimation, namely the choice of prior distribution family, the specification of the parameters of prior densities and the evaluation of the MAP estimates, are addressed. Using HMMs with Gaussian mixture state observation(More)
A critical component in the pattern matching approach to speech recognition is the training algorithm, which aims at producing typical (reference) patterns or models for accurate pattern comparison. In this paper, we discuss the issue of speech recognizer training from a broad perspective with root in the classical Bayes decision theory. We differentiate(More)
It is generally agreed that, for a given speech recognition task, a speaker-dependent system usually outperforms a speaker-independent system, as long as a sufficient amount of training data is available. When the amount of speaker-specific training data is limited, however, such a performance gain is not guaranteed. One way to improve the performance is to(More)
Transformation-based model adaptation techniques like maximum likelihood linear regression (MLLR) rely on an accurate selection of the number of transformations for a given amount of adaptation data. If too many transformations are used, the transformation parameters may be poorly estimated, can overfit the adaptation data, and offer poor generalization. On(More)
We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). It is assumed that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by the acoustic segment models (ASMs). A spoken utterance is then decoded into(More)
We present a framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities. The QB formulation is based on the theory of recursive Bayesian inference. The QB algorithm is designed to incrementally update the hyperparameters of the approximate posterior(More)
In the past few years, transformation-based model adaptation techniques have been widely used to help reducing acoustic mismatch between training and testing conditions of automatic speech recognizers. The estimation of the transformation parameters is usually carried out using estimation paradigms based on classical statistics such as maximum likelihood,(More)
This letter presents a regression-based speech enhancement framework using deep neural networks (DNNs) with a multiple-layer deep architecture. In the DNN learning process, a large training set ensures a powerful modeling capability to estimate the complicated nonlinear mapping from observed noisy speech to desired clean signals. Acoustic context was found(More)