Heikki Kallasjoki

This work presents an automatic speech recognition system that uses a missing data approach to compensate for environmental noise. The missing, noise-corrupted components are identified using binaural features or a support vector machine (SVM) classifier. To perform speech recognition using the partially observed data, the missing components are …
The problem of reverberation in speech recognition is addressed in this study by extending a noise-robust feature enhancement method based on non-negative matrix factorization. The signal model of the observation as a linear combination of sample spectrograms is augmented by a mel-spectral feature domain convolution to account for the effects of room …
Following earlier work, we modify linear predictive (LP) speech analysis by including temporal weighting of the squared prediction error in the model optimization. To focus this so-called weighted LP model on the least noisy signal regions in the presence of stationary additive noise, we use short-time signal energy as the weighting function. We …
We present an automatic speech recognition system that uses a missing data approach to compensate for challenging environmental noise containing both additive and convolutive components. The unreliable and noise-corrupted (“missing”) components are identified using a Gaussian mixture model (GMM) classifier based on a diverse range of acoustic features. To …
We present a method of improving automatic speech recognition performance under noisy conditions by using a source separation approach to extract the underlying clean speech signal. The feature enhancement processing is complemented with heuristic estimates of the uncertainty of the source separation, which are used to further assist the recognition. The …
This paper addresses dereverberation of speech using an unsupervised approach that utilizes a speech prior and makes only weak assumptions about reverberation. Our approach uses a long-time context representation of reverberated speech in spectral-temporal supervectors, which are decorrelated by PCA. In the decorrelated domain, supervectors are mapped from …
This work studies the use of observation uncertainty measures for improving the speech recognition performance of a front end based on exemplar-based source separation. To generate the observation uncertainty estimates for the enhanced features, we propose the use of heuristic methods based on the sparse representation of the noisy signal in the …
In this work, we present an automatic speech recognition (ASR) system based on missing feature reconstruction, in which masks are estimated by binary classification of features generated by Gaussian-Bernoulli restricted Boltzmann machines (GRBMs). The system is evaluated on Track 1 of the 2nd CHiME challenge data. Overall, the best performance is …
In this paper, we study the role of a recently proposed feature enhancement technique in building HMM-based synthetic voices using reverberant speech data. The feature enhancement technique studied combines the advantages of missing data imputation and non-negative matrix factorization (NMF) based methods in cleaning up the reverberant features. Speaker …
This paper describes a novel two-stage dereverberation feature enhancement method for noise-robust automatic speech recognition. In the first stage, an estimate of the dereverberated speech is generated by matching the distribution of the observed reverberant speech to that of clean speech, in a decorrelated transformation domain that has a long temporal …