A comparison of three non-linear observation models for noisy speech features

Abstract

This paper reports our recent efforts to develop a unified, non-linear, stochastic model for estimating and removing the effects of additive noise on speech cepstra. The complete system consists of prior models for speech and noise, an observation model, and an inference algorithm. The observation model quantifies the relationship between clean speech, noise, and the noisy observation. Because it is expressed in terms of log Mel-frequency filter-bank features, it is non-linear. The inference algorithm is the procedure by which the clean speech and noise are estimated from the noisy observation. The most critical component of the system is the observation model. This paper derives a new approximation strategy and compares it with two existing approximations. The new approximation is shown to require half the computation while producing equivalent or improved word accuracy compared with the previous techniques. We present noise-robust recognition results on the standard Aurora 2 task.
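
To illustrate why an observation model in the log filter-bank domain is non-linear, here is a minimal sketch of the commonly used relationship y = x + log(1 + exp(n - x)), where x, n, and y are the log energies of clean speech, noise, and the noisy observation in one Mel bin. It assumes the phase-dependent cross term between speech and noise is ignored. It is not one of the three approximations compared in the paper, and the function name `noisy_log_energy` is hypothetical.

```python
import math


def noisy_log_energy(x, n):
    """Log energy of speech plus noise in one Mel bin (illustrative only).

    x and n are natural-log energies of clean speech and noise.
    Assumption: the phase cross term between speech and noise is ignored,
    so linear-domain energies simply add.
    """
    # log(exp(x) + exp(n)), computed stably as max + log1p(exp(min - max)).
    hi, lo = max(x, n), min(x, n)
    return hi + math.log1p(math.exp(lo - hi))


# Speech far above the noise: the observation is close to the clean speech.
print(noisy_log_energy(10.0, 0.0))
# Equal speech and noise energies: the observation sits log(2) above both.
print(noisy_log_energy(5.0, 5.0))
```

Although the energies add in the linear domain, the observation is a non-linear function of x and n once logs are taken, which is why recovering the clean speech from it calls for an approximation strategy.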

Cite this paper

@inproceedings{Droppo2003ACO, title={A comparison of three non-linear observation models for noisy speech features}, author={Jasha Droppo and Li Deng and Alex Acero}, booktitle={INTERSPEECH}, year={2003} }