Strategies for modeling reverberant speech in the feature domain


The room impulse response characterizing the acoustic path between speaker and microphone is significantly longer than the analysis window used for feature extraction in automatic speech recognition (ASR) systems. Therefore, reverberation caused by multi-path propagation of sound waves from the speaker to distant-talking microphones has a dispersive effect on speech feature sequences. This dispersive effect causes a mismatch between the input speech and the acoustic models of the recognizer, which are usually trained on clean speech, and leads to a significant reduction of recognition performance. In this contribution, different strategies for obtaining acoustic models that capture the dispersive effect of reverberation are investigated in terms of modeling accuracy, flexibility with respect to changing reverberation conditions, the effort required to obtain the reverberation representation, and decoding complexity.
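The dispersive effect described above can be illustrated with a small numerical sketch. It is not the authors' method: the sampling rate, window length, frame shift, reverberation time, the -30 dB activity floor, and the exponentially decaying noise model for the room impulse response are all assumptions chosen for illustration, and the helper names (`synthetic_rir`, `frame_energies`, `active_frames`) are hypothetical. The sketch convolves a one-window-long noise burst with the synthetic impulse response and counts how many analysis frames carry significant energy before and after.

```python
import math
import random

FS = 8000    # sampling rate in Hz (assumed)
WIN = 200    # analysis window: 25 ms at 8 kHz (assumed)
HOP = 80     # frame shift: 10 ms at 8 kHz (assumed)
T60 = 0.4    # reverberation time in seconds (assumed)


def synthetic_rir(t60, fs, rng):
    """Hypothetical RIR model: Gaussian noise whose amplitude decays by 60 dB over t60."""
    n = int(t60 * fs)
    decay = math.log(1000.0) / (t60 * fs)  # factor 1000 in amplitude = 60 dB
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * k) for k in range(n)]


def convolve(x, h):
    """Direct-form linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def frame_energies(x):
    """Energy of each rectangular analysis window, shifted by HOP samples."""
    return [sum(v * v for v in x[s:s + WIN])
            for s in range(0, len(x) - WIN + 1, HOP)]


def active_frames(energies, floor_db=-30.0):
    """Number of frames whose energy exceeds floor_db relative to the strongest frame."""
    threshold = max(energies) * 10.0 ** (floor_db / 10.0)
    return sum(1 for e in energies if e > threshold)


rng = random.Random(0)
burst = [rng.gauss(0.0, 1.0) for _ in range(WIN)]   # one window of "speech"
rir = synthetic_rir(T60, FS, rng)

clean = burst + [0.0] * (len(rir) - 1)              # zero-padded to equal length
reverberant = convolve(burst, rir)

clean_frames = active_frames(frame_energies(clean))
reverb_frames = active_frames(frame_energies(reverberant))

print("RIR length in samples:", len(rir), "window length:", WIN)
print("active frames, clean:", clean_frames)
print("active frames, reverberant:", reverb_frames)
```

Because the assumed impulse response spans many analysis windows, the energy of a single-window event is smeared over many subsequent frames, which is the kind of feature-sequence dispersion that models trained on clean speech do not capture.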

DOI: 10.1109/ICASSP.2009.4960436

Cite this paper

@inproceedings{Sehr2009StrategiesFM,
  title={Strategies for modeling reverberant speech in the feature domain},
  author={Armin Sehr and Walter Kellermann},
  booktitle={2009 IEEE International Conference on Acoustics, Speech and Signal Processing},
  year={2009},
  pages={3725--3728}
}