Text-to-speech inspired duration modeling for improved whole-word acoustic models

Abstract

In the construction of whole-word acoustic models, we have previously demonstrated substantial gains by using MAP estimation to introduce a simple prior model of phonetic timing. Based solely on the word’s phonetic (dictionary) pronunciation, this simple model included no information about the individual durations of constituent phones. However, the problem…

4 Figures and Tables
