Language model adaptation using auto-induced semantic structures in a voice search system

Abstract

In this paper, we study how to generate in-domain data for statistical language model adaptation in a Chinese voice search dialogue system. Given a limited amount of in-domain data, we use unsupervised clustering to induce semantic classes and structures from the first part of the test data. These structures are then augmented with domain information to generate a large amount of in-domain data. Finally, we test on the second part of the test data and obtain a 6.2% improvement in speech recognition performance.
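
The generation step described in the abstract can be illustrated with a minimal sketch. It assumes that an induced structure is a token sequence in which some tokens are semantic class slots, and that domain information is a list of members per class. The slot notation (`<POI>`, `<CITY>`), the example entries, and the function name `expand_structures` are hypothetical and are not taken from the paper; the clustering step that would induce the classes and structures is not shown.

```python
from itertools import product


def expand_structures(structures, class_members):
    """Expand each structure into sentences by filling its class slots.

    structures: list of token lists; a token that is a key of class_members
        is treated as a slot (hypothetical notation, e.g. "<CITY>").
    class_members: dict mapping a slot token to its list of domain entries.
    """
    sentences = []
    for structure in structures:
        # A slot offers every member of its class; a literal token offers only itself.
        choices = [class_members.get(token, [token]) for token in structure]
        for combination in product(*choices):
            sentences.append(" ".join(combination))
    return sentences


# Illustrative data only; a real system would use induced structures
# and a domain database instead.
structures = [["find", "<POI>", "in", "<CITY>"]]
class_members = {
    "<POI>": ["hotels", "restaurants"],
    "<CITY>": ["Beijing", "Shanghai"],
}
corpus = expand_structures(structures, class_members)
```

The resulting sentences could then serve as training text for an adapted n-gram model. Exhaustive expansion grows multiplicatively with the class sizes, so a practical system would likely sample or weight the expansions rather than enumerate all of them.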

Cite this paper

@article{Li2009LanguageMA,
  title={Language model adaptation using auto-induced semantic structures in a voice search system},
  author={Yali Li and Ta Li and Yonghong Yan},
  journal={2009 IEEE International Conference on Intelligent Computing and Intelligent Systems},
  year={2009},
  volume={3},
  pages={350-353}
}