Robust feature extraction to utterance fluctuations due to articulation disorders based on sparse expression

Abstract

We investigated speech recognition for a person with articulation disorders resulting from athetoid cerebral palsy. Recently, the accuracy of speaker-independent speech recognition has improved remarkably through stochastic modeling of speech. However, such acoustic models degrade recognition accuracy for speakers with atypical speech styles (e.g., those caused by articulation disorders). In this paper, we discuss our efforts to build an acoustic model for a person with articulation disorders. The articulation of the first utterance tends to be more unstable than that of subsequent utterances owing to strain on speech-related muscles, and this instability degrades recognition accuracy. We therefore propose a robust feature extraction method based on exemplar-based sparse representation using non-negative matrix factorization (NMF). In our method, the unstable first utterance is expressed as a linear, non-negative combination of a small number of bases constructed from the speaker's more stable utterances, and the combination coefficients are then used as acoustic features. The effectiveness of the method has been confirmed by word-recognition experiments.
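The core idea above, decomposing an observed spectrum against a fixed exemplar dictionary and keeping only the activation coefficients as features, can be sketched with standard KL-divergence multiplicative NMF updates. This is a minimal illustrative sketch, not the authors' implementation: the dictionary `W` here stands in for bases built from a speaker's stable utterances, and all array shapes and parameter values are assumptions.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H such that V ~ W @ H,
    holding the exemplar dictionary W fixed.

    Uses the standard KL-divergence multiplicative update for H only;
    the resulting columns of H would serve as the sparse features.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # non-negative init
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        # multiplicative update keeps H non-negative by construction
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
    return H

# Toy example (hypothetical data): 4 exemplar spectra of dimension 20,
# and one observed frame that is a sparse mix of two of them.
rng = np.random.default_rng(1)
W = rng.random((20, 4)) + 0.1          # fixed exemplar dictionary
h_true = np.array([[0.0], [2.0], [0.0], [1.0]])
V = W @ h_true                          # observed (unstable) frame
H = nmf_activations(V, W)               # activation coefficients = features
```

Because `W` is never updated, the decomposition is forced to explain the unstable input in terms of the stable exemplars, which is what makes the coefficients usable as a normalized acoustic feature.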

7 Figures and Tables

Cite this paper

@article{Yoshioka2012RobustFE,
  title   = {Robust feature extraction to utterance fluctuations due to articulation disorders based on sparse expression},
  author  = {Toshiya Yoshioka and Ryoichi Takashima and Tetsuya Takiguchi and Yasuo Ariki},
  journal = {Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference},
  year    = {2012},
  pages   = {1--4}
}