Improved context-dependent acoustic modeling for continuous Chinese speech recognition

Abstract

This paper describes a new framework for context-dependent (CD) Initial/Final (IF) acoustic modeling using decision-tree-based state tying for continuous Chinese speech recognition. Based on the characteristics of the Chinese language, the Extended Initial/Final (XIF) set is chosen as the basic speech recognition unit (SRU) set, and it outperforms the standard IF set. After the decision tree has been constructed, an adaptive mixture-increasing strategy is applied when splitting the single Gaussian in each tied state into a Gaussian mixture. Our experimental results show that these two improvements benefit acoustic modeling for Chinese speech recognition and that the CD XIF model outperforms the baseline syllable model by more than 30%.
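The abstract does not say how the number of mixture components is adapted, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes diagonal-covariance Gaussians, a conventional mean-perturbation split (the heaviest component is duplicated, its weight halved, and its means shifted by a fraction of the standard deviation), and a hypothetical rule that lets tied states with more training frames receive more components. The names `Gaussian`, `split_heaviest`, `target_mixtures`, `grow_state`, and the parameters `perturb`, `frames_per_mixture`, and `max_mixtures` are all illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    weight: float
    mean: List[float]
    var: List[float]  # diagonal covariance (assumption)


def split_heaviest(mixture: List[Gaussian], perturb: float = 0.2) -> List[Gaussian]:
    """Replace the heaviest component by two copies with half the weight
    and means shifted by +/- perturb standard deviations."""
    idx = max(range(len(mixture)), key=lambda i: mixture[i].weight)
    g = mixture[idx]
    offsets = [perturb * math.sqrt(v) for v in g.var]
    plus = Gaussian(g.weight / 2, [m + o for m, o in zip(g.mean, offsets)], list(g.var))
    minus = Gaussian(g.weight / 2, [m - o for m, o in zip(g.mean, offsets)], list(g.var))
    return mixture[:idx] + [plus, minus] + mixture[idx + 1:]


def target_mixtures(occupancy: float,
                    frames_per_mixture: float = 100.0,
                    max_mixtures: int = 8) -> int:
    """Hypothetical adaptive rule: one component per `frames_per_mixture`
    training frames assigned to the tied state, clamped to [1, max_mixtures]."""
    return max(1, min(max_mixtures, int(occupancy // frames_per_mixture)))


def grow_state(single: Gaussian, occupancy: float, **rule_kwargs) -> List[Gaussian]:
    """Grow a tied state's single Gaussian into a mixture of adaptive size.
    A real trainer would re-estimate parameters after every split."""
    mixture = [single]
    while len(mixture) < target_mixtures(occupancy, **rule_kwargs):
        mixture = split_heaviest(mixture)
    return mixture


if __name__ == "__main__":
    seed = Gaussian(1.0, [0.0, 1.0], [1.0, 4.0])
    for frames in (50, 450):
        print(frames, len(grow_state(seed, occupancy=frames)))
```

Under this assumed rule, a sparsely observed tied state stays a single Gaussian while a well-observed one grows to several components, which is the general motivation for making the mixture count adaptive rather than fixed.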

Cite this paper

@inproceedings{Zhang2001ImprovedCA,
  title={Improved context-dependent acoustic modeling for continuous Chinese speech recognition},
  author={Jiyong Zhang and Thomas Fang Zheng and Jing Li and Chunhua Luo and Guoliang Zhang},
  booktitle={INTERSPEECH},
  year={2001}
}