• Corpus ID: 108746063

Speech recognition in noisy environments

@inproceedings{Moreno1996SpeechRI,
  title={Speech recognition in noisy environments},
  author={Pedro J. Moreno},
  year={1996}
}
The accuracy of speech recognition systems degrades severely when the systems are operated in adverse acoustical environments. In recent years many approaches have been developed to address the problem of robust speech recognition, using feature-normalization algorithms, microphone arrays, representations based on human hearing, and other approaches. Nevertheless, to date the improvement in recognition accuracy afforded by such algorithms has been limited, in part because of inadequacies in… 
Microphone array processing for robust speech recognition
TLDR
A new approach to microphone-array processing is proposed in which the goal of the array processing is not to generate an enhanced output waveform but rather to generate a sequence of features which maximizes the likelihood of the correct hypothesis.
HMM Adaptation Using Statistical Linear Approximation for Robust Speech Recognition
TLDR
The proposed robustness method is highly attractive for the Distributed Speech Recognition (DSR) architecture, since it affects neither the front-end structure nor the ASR topology.
Missing-Feature Approaches in Speech Recognition [ Improving recognition accuracy in noise by using partial spectrographic information ]
Despite decades of focused research on the problem, the accuracy of automatic speech recognition (ASR) systems is still adversely affected by noise and other sources of acoustical variability. For…
Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition
TLDR
A novel approach to automatic speech recognition performance degradation is proposed, in which SS and the speech recognizer are considered not as two independent entities cascaded together, but rather as two interconnected components of a single system, sharing the common goal of improved speech recognition accuracy.
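For background, the conventional spectral subtraction (SS) front end that this likelihood-maximizing formulation builds on can be sketched in a few lines. This is a generic textbook form with illustrative names and parameters, not the paper's multiband, recognizer-driven algorithm:

```python
import numpy as np

def spectral_subtraction(noisy_mag, noise_mag, alpha=2.0, beta=0.01):
    """Magnitude-domain spectral subtraction for one analysis frame.

    noisy_mag -- |Y(f)|, magnitude spectrum of the noisy frame
    noise_mag -- |N(f)|, running estimate of the noise magnitude
    alpha     -- over-subtraction factor
    beta      -- spectral-floor factor, guards against negative bins
    """
    cleaned = noisy_mag - alpha * noise_mag
    # Floor the result to avoid negative magnitudes and reduce
    # "musical noise" artifacts.
    return np.maximum(cleaned, beta * noisy_mag)

# Toy single-frame example with a flat noise estimate.
noisy = np.array([1.0, 0.5, 0.8, 0.2])
noise = np.full(4, 0.1)
print(spectral_subtraction(noisy, noise, alpha=1.0))
```

The likelihood-maximizing idea is then to choose parameters such as `alpha` and `beta` to maximize recognition likelihood rather than waveform quality.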
Cepstral compensation by polynomial approximation for environment-independent speech recognition
TLDR
This work introduces an approximation-based method to compute the effects of the environment on the parameters of the PDF of clean speech, and performs compensation by vector polynomial approximations (VPS) for the effects of linear filtering and additive noise on the clean speech.
Model-space compensation of microphone and noise for speaker-independent speech recognition
  • Y. Gong
  • Physics, Computer Science
    2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
  • 2003
TLDR
A method called JAC (Joint compensation of Additive and Convolutive distortions) is presented that uses two log-spectral domain components in speech acoustic models to represent additive and convolutive distortions, and reduces recognition word error rate by an order of magnitude.
Enhancement of adaptive de-correlation filtering separation model for robust speech recognition
  • R. Hu
  • Computer Science
  • 2007
TLDR
To improve speech spectral characteristics, prewhitening procedures are applied to flatten the long-term speech spectrum, improving adaptation robustness and decreasing ADF estimation error; block-iterative implementation and variable step-size methods are proposed to speed up the convergence rate.
Learning Dynamic Noise Models from Noisy Speech for Robust Speech Recognition
TLDR
The approximate inference technique is used as an approximate E step in a generalized EM algorithm that learns the parameters of the noise model from a test utterance; the method performs as well as or significantly better than the non-adaptive algorithm, without the need for a separate training set of noise examples.
...

References

Showing 1-10 of 52 references
Acoustical and environmental robustness in automatic speech recognition
TLDR
This dissertation describes a number of algorithms developed to increase the robustness of automatic speech recognition systems with respect to changes in the environment, including SNR-Dependent Cepstral Normalization (SDCN) and Codeword-Dependent Cepstral Normalization (CDCN).
A unified approach for robust speech recognition
TLDR
Three techniques are presented that share the same basic assumptions and internal structure but differ in whether they modify the incoming speech cepstra or the classifier statistics; the goal is to unify these approaches to robust speech recognition.
APPROACHES TO ENVIRONMENT COMPENSATION IN AUTOMATIC SPEECH RECOGNITION
TLDR
Three new cepstral-domain compensation strategies, SNR based MultivaRiate gAussian based cepsTral normaliZation (SNR-based RATZ), STAtistical Reestimation of HMMs (STAR), and new CDCN (NCDCN) are described, which achieve improved performance through the use of better mathematical models which introduce strong structural constraints into the assumed distribution for speech.
Efficient joint compensation of speech for the effects of additive noise and linear filtering
  • Fu-hua Liu, A. Acero, R. Stern
  • Computer Science
    [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing
  • 1992
TLDR
Two algorithms are described that provide robustness for automatic speech recognition systems in a fashion suitable for real-time environmental normalization on workstations of moderate size, together with a modification of the more complex CDCN algorithm that enables it to perform environmental compensation in better than real time.
Model-based techniques for noise robust speech recognition
TLDR
The development of a model-based noise compensation technique, Parallel Model Combination (PMC), is detailed; it alters the parameters of a set of Hidden Markov Model (HMM) based acoustic models so that they reflect speech spoken in a new acoustic environment.
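The core of PMC is combining clean-speech and noise model parameters in the linear power domain and mapping back. A minimal sketch of the log-add approximation for static means follows; the variance combination and cepstral-to-log-spectral transforms of the full method are omitted, and all names are illustrative:

```python
import numpy as np

def pmc_static_mean(clean_log_mean, noise_log_mean, gain_db=0.0):
    """Simplified Parallel Model Combination (log-add approximation):
    map log-spectral Gaussian means to the linear power domain, add
    speech and noise contributions, and map back to the log domain."""
    g = 10.0 ** (gain_db / 10.0)  # speech/noise level-matching gain
    return np.log(g * np.exp(clean_log_mean) + np.exp(noise_log_mean))

clean = np.array([2.0, 1.0, 0.5])   # clean HMM mean (illustrative)
noise = np.array([0.0, 0.0, 0.0])   # noise model mean (illustrative)
print(pmc_static_mean(clean, noise))
```

The combined mean replaces the clean mean in the recognizer's acoustic model, so recognition proceeds on unmodified noisy features.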
Adaptation to New Microphones Using Tied-Mixture Normalization
TLDR
Experimental results show that the proposed algorithm, combined with cepstrum mean subtraction, improves the recognition accuracy when the system is tested on a microphone with different characteristics than the one on which it was trained.
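The cepstrum mean subtraction baseline mentioned above is simple enough to sketch directly; this is the generic technique, not the paper's tied-mixture normalization:

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Subtract each cepstral coefficient's per-utterance mean.
    cepstra: (num_frames, num_coeffs) array.  A stationary convolutive
    channel adds a constant offset in the cepstral domain, so removing
    the mean removes much of the microphone/channel effect."""
    return cepstra - cepstra.mean(axis=0, keepdims=True)

frames = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])
print(cepstral_mean_subtraction(frames))  # each column now sums to zero
```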
Rapid environment adaptation for robust speech recognition
TLDR
A rapid environment adaptation algorithm based on spectrum equalization (REALISE) improved recognition accuracy from 87% to 96% in a 250 Japanese word recognition task.
The HTK large vocabulary recognition system for the 1995 ARPA H3 task
TLDR
Developments of the HTK large vocabulary speech recognition system aimed at recognition of speech from the ARPA H3 task which contains data of a relatively low signal-to-noise ratio from unknown microphones are described.
Probabilistic optimum filtering for robust speech recognition
  • L. Neumeyer, M. Weintraub
  • Computer Science
    Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing
  • 1994
TLDR
A new mapping algorithm for speech recognition that relates the features of simultaneous recordings of clean and noisy speech to reduce recognition errors when the training and testing acoustic environments do not match is presented.
Signal Processing for Robust Speech Recognition
TLDR
Use of the various compensation algorithms in consort produces a reduction of error rates for SPHINX-II by as much as 40 percent relative to the rate achieved with cepstral mean normalization alone, in both development test sets and in the context of the 1993 ARPA CSR evaluations.
...