Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation

Abstract

The Hidden Dynamic Model (HDM) has been an attractive acoustic modeling approach because it provides a computational model for coarticulation and the dynamics of human speech. However, the lack of a direct decoding algorithm has been a barrier to research progress on HDM. We have developed a new HDM-based acoustic model, the Hidden-Trajectory HMM (HTHMM), which combines the state/mixture topology of a traditional monophone HMM with a target-directed hidden-trajectory model (a special form of HDM) for coarticulation modeling. Because the classical Viterbi algorithm is not admissible, we have developed a novel MAP decoding algorithm for HTHMM that correctly takes the hidden continuous trajectory into account. This paper introduces our new HTHMM decoder, which makes it possible, for the first time, to evaluate an HDM-type model by direct decoding instead of N-best rescoring. Using direct decoding, we demonstrate that the coarticulatory mechanism of our HTHMM matches traditional context-dependent modeling (enumeration of model parameters): the context-independent HTHMM has slightly better accuracy than a crossword-triphone HMM on the Aurora2 task. The decoder also enables us to incorporate state-boundary optimization into the HDM/HTHMM training procedure. This paper presents the detailed decoding algorithm and evaluation results, while in [1] we present the HTHMM model itself and parameter training.
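
To make the idea of a target-directed hidden trajectory concrete, here is a minimal Python sketch. It assumes a scalar first-order recursion that pulls the hidden value toward a per-state target. The abstract does not specify the model's form, so the recursion, the function name, the `rate` parameter, and the target values are all illustrative assumptions rather than the authors' formulation. The sketch shows the hidden value inside the same state differing with the preceding state, which is a simple picture of coarticulation. It also hints at why a standard Viterbi recursion may not apply directly: the per-frame value depends on the whole state history, not only on the current state.

```python
def target_directed_trajectory(states, targets, rate=0.7, start=0.0):
    """Hypothetical first-order target-directed recursion (an assumption,
    not the paper's model): z[t] = rate * z[t-1] + (1 - rate) * target."""
    z = start
    trajectory = []
    for s in states:
        # Move a fraction (1 - rate) of the way toward the current state's target.
        z = rate * z + (1.0 - rate) * targets[s]
        trajectory.append(z)
    return trajectory


# Illustrative targets for three made-up states (values are arbitrary).
targets = {"a": 1.0, "i": -1.0, "u": 0.5}

# The same final state "u", reached from two different left contexts.
after_a = target_directed_trajectory(["a"] * 5 + ["u"] * 3, targets)
after_i = target_directed_trajectory(["i"] * 5 + ["u"] * 3, targets)

# Frames 5..7 lie inside "u" in both sequences but carry different values.
for t in range(5, 8):
    print(t, round(after_a[t], 3), round(after_i[t], 3))
```

In this toy setting a single set of targets yields context-dependent behavior without enumerating one parameter set per context, and the two trajectories move closer together the longer the sequence stays in the shared state.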

DOI: 10.1109/ICASSP.2003.1198889

Cite this paper

@inproceedings{Seide2003CoarticulationMB, title={Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation}, author={Frank Seide and Jian-Lai Zhou and Li Deng}, booktitle={ICASSP}, year={2003} }