Duration modeling with expanded HMM applied to speech recognition


In this paper, the occupancy of the HMM states is modeled by means of a Markov chain. A linear estimator is introduced to compute the probabilities of the Markov chain. The distribution functions (DFs) represent the observed data accurately. Representing the DF as a Markov chain allows the use of standard HMM recognizers. The increase in complexity is negligible in training and strongly limited during recognition. Experiments on acoustic-phonetic decoding show that the phone recognition rate increases from 60.6% to 61.1%. Furthermore, on a database inquiry task, where phones are used as subword units, the correct word rate increases from 88.2% to 88.4%.
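
To illustrate how a state-occupancy (duration) distribution can be carried by a Markov chain, the following is a minimal sketch in Python. It does not reproduce the linear estimator introduced in the paper, which the abstract does not describe. Instead it assumes a simple left-to-right chain of sub-states in which sub-state d is left with probability P(duration = d | duration >= d), a common textbook construction. The function names and the example distribution are hypothetical.

```python
def chain_exit_probs(duration_pmf):
    """Exit probability of each sub-state in a left-to-right chain.

    duration_pmf[d - 1] is the assumed P(duration = d) for d = 1..D.
    Sub-state d is left with probability P(duration = d | duration >= d);
    otherwise the chain advances to sub-state d + 1.
    """
    exit_probs = []
    for d, p in enumerate(duration_pmf):
        tail = sum(duration_pmf[d:])  # P(duration >= d + 1 in 0-based terms)
        exit_probs.append(p / tail if tail > 0 else 1.0)
    return exit_probs


def chain_duration_pmf(exit_probs):
    """Duration distribution implied by a chain with the given exit probabilities."""
    pmf = []
    still_inside = 1.0  # probability of having reached the current sub-state
    for e in exit_probs:
        pmf.append(still_inside * e)
        still_inside *= 1.0 - e
    return pmf


# Hypothetical target distribution over durations of 1 to 4 frames.
target = [0.1, 0.3, 0.4, 0.2]
exit_probs = chain_exit_probs(target)
recovered = chain_duration_pmf(exit_probs)
```

Because every sub-state shares the emission distribution of the original state, a chain of this kind can be decoded by an ordinary HMM recognizer, which is the property the abstract relies on. The price is a larger state space, one sub-state per modeled duration in this sketch.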

Cite this paper

@inproceedings{Bonafonte1996DurationMW,
  title={Duration modeling with expanded HMM applied to speech recognition},
  author={Antonio Bonafonte and Josep Vidal and Albino Nogueiras},
  booktitle={ICSLP},
  year={1996}
}