Speech recognition in noisy environments
@inproceedings{Moreno1996SpeechRI, title={Speech recognition in noisy environments}, author={Pedro J. Moreno}, year={1996} }
The accuracy of speech recognition systems degrades severely when the systems are operated in adverse acoustical environments. In recent years many approaches have been developed to address the problem of robust speech recognition, using feature-normalization algorithms, microphone arrays, representations based on human hearing, and other approaches.
Nevertheless, to date the improvement in recognition accuracy afforded by such algorithms has been limited, in part because of inadequacies in…
219 Citations
Microphone array processing for robust speech recognition
- Physics
- 2003
A new approach to microphone-array processing is proposed in which the goal of the array processing is not to generate an enhanced output waveform but rather to generate a sequence of features which maximizes the likelihood of the correct hypothesis.
HMM Adaptation Using Statistical Linear Approximation for Robust Speech Recognition
- Computer Science
- 2011
The proposed robustness method is highly attractive for the Distributed Speech Recognition (DSR) architecture, since it affects neither the Front End structure nor the ASR topology.
Missing-Feature Approaches in Speech Recognition [ Improving recognition accuracy in noise by using partial spectrographic information ]
- Physics
- 2009
Despite decades of focused research on the problem, the accuracy of automatic speech recognition (ASR) systems is still adversely affected by noise and other sources of acoustical variability. For…
Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition
- Computer Science
- EURASIP J. Adv. Signal Process.
- 2009
A novel approach for solving the problem of automatic speech recognition performance degradation by considering spectral subtraction (SS) and the speech recognizer not as two independent entities cascaded together, but rather as two interconnected components of a single system, sharing the common goal of improved speech recognition accuracy.
Cepstral compensation by polynomial approximation for environment-independent speech recognition
- Computer Science
- Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
- 1996
This work introduces an approximation-based method to compute the effects of the environment on the parameters of the PDF of clean speech, and performs compensation by vector polynomial approximations (VPS) for the effect of linear filtering and additive noise on the clean speech.
Model-space compensation of microphone and noise for speaker-independent speech recognition
- Physics, Computer Science
- 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
- 2003
A method, called JAC (Joint compensation of Additive and Convolutive distortions), is presented that uses two log-spectral domain components in speech acoustic models to represent additive and convolutive distortions and reduces recognition word error rate by an order of magnitude.
Enhancement of adaptive de-correlation filtering separation model for robust speech recognition
- Computer Science
- 2007
To account for speech spectral characteristics, prewhitening procedures are applied to flatten the long-term speech spectrum, improving adaptation robustness and decreasing ADF estimation error; block-iterative implementation and variable step-size methods are proposed to speed up convergence.
Learning Dynamic Noise Models from Noisy Speech for Robust Speech Recognition
- Computer Science
- 2001
The approximate inference technique is used as an approximate E step in a generalized EM algorithm that learns the parameters of the noise model from a test utterance; the resulting method performs as well as or significantly better than the non-adaptive algorithm, without the need for a separate training set of noise examples.
Feature domain compensation of nonstationary noise for robust speech recognition
- Computer Science
- Speech Commun.
- 2002
References
SHOWING 1-10 OF 52 REFERENCES
Acoustical and environmental robustness in automatic speech recognition
- Computer Science
- 1991
This dissertation describes a number of algorithms developed to increase the robustness of automatic speech recognition systems with respect to changes in the environment, including SNR-Dependent Cepstral Normalization (SDCN) and Codeword-Dependent Cepstral Normalization (CDCN).
A unified approach for robust speech recognition
- Computer Science
- EUROSPEECH
- 1995
Three techniques are presented that share the same basic assumptions and internal structure but differ in whether they modify the incoming speech cepstra or the classifier statistics, with the aim of unifying these approaches to robust speech recognition.
Approaches to environment compensation in automatic speech recognition
- Computer Science
- 1995
Three new cepstral-domain compensation strategies, SNR based MultivaRiate gAussian based cepsTral normaliZation (SNR-based RATZ), STAtistical Reestimation of HMMs (STAR), and new CDCN (NCDCN) are described, which achieve improved performance through the use of better mathematical models which introduce strong structural constraints into the assumed distribution for speech.
Efficient joint compensation of speech for the effects of additive noise and linear filtering
- Computer Science
- [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing
- 1992
Two algorithms are described that provide robustness for automatic speech recognition systems in a fashion suitable for real-time environmental normalization on workstations of moderate size, including a modification of the more complex CDCN algorithm that enables it to perform environmental compensation in better than real time.
Model-based techniques for noise robust speech recognition
- Computer Science
- 1995
The development of a model-based noise compensation technique, Parallel Model Combination, to alter the parameters of a set of Hidden Markov Model (HMM) based acoustic models, so that they reflect speech spoken in a new acoustic environment is detailed.
Adaptation to New Microphones Using Tied-Mixture Normalization
- Computer Science
- HLT
- 1994
Experimental results show that the proposed algorithm, combined with cepstrum mean subtraction, improves the recognition accuracy when the system is tested on a microphone with different characteristics than the one on which it was trained.
Rapid environment adaptation for robust speech recognition
- Computer Science
- 1995 International Conference on Acoustics, Speech, and Signal Processing
- 1995
A rapid environment adaptation algorithm based on spectrum equalization (REALISE) improved recognition accuracy from 87% to 96% in a 250 Japanese word recognition task.
The HTK large vocabulary recognition system for the 1995 ARPA H3 task
- Computer Science
- 1996
Developments of the HTK large vocabulary speech recognition system aimed at recognition of speech from the ARPA H3 task which contains data of a relatively low signal-to-noise ratio from unknown microphones are described.
Probabilistic optimum filtering for robust speech recognition
- Computer Science
- Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing
- 1994
A new mapping algorithm for speech recognition that relates the features of simultaneous recordings of clean and noisy speech to reduce recognition errors when the training and testing acoustic environments do not match is presented.
Signal Processing for Robust Speech Recognition
- Computer Science
- HLT
- 1994
Use of the various compensation algorithms in consort produces a reduction of error rates for SPHINX-II by as much as 40 percent relative to the rate achieved with cepstral mean normalization alone, in both development test sets and in the context of the 1993 ARPA CSR evaluations.