Audiovisual speech inversion by switching dynamical modeling governed by a Hidden Markov process


We propose a unified framework to recover articulation from audiovisual speech. The nonlinear audiovisual-to-articulatory mapping is modeled by means of a switching linear dynamical system. Switching is governed by a state sequence determined via a Hidden Markov Model alignment process. Mel Frequency Cepstral Coefficients are extracted from audio while…
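
As an illustration of the switching idea described above, the following is a minimal sketch, not the authors' implementation. It assumes the state sequence is already available (standing in for the Hidden Markov Model alignment) and that each state carries its own linear-Gaussian model, so that a standard Kalman filter can run with parameters swapped frame by frame. The function name `switching_kalman_filter`, the parameter layout `(A, C, Q, R)`, and the use of Kalman filtering for the inversion step are all assumptions made for this example.

```python
import numpy as np


def switching_kalman_filter(obs, states, params, x0, P0):
    """Estimate a hidden trajectory with a per-frame choice of linear model.

    obs    : (T, d_obs) array of observations (stand-in for audiovisual features)
    states : (T,) integer labels, assumed given (stand-in for an HMM alignment)
    params : dict mapping state -> (A, C, Q, R), hypothetical per-state matrices
             A: hidden dynamics, C: observation matrix,
             Q: process noise covariance, R: observation noise covariance
    x0, P0 : initial hidden mean and covariance
    """
    x, P = x0.copy(), P0.copy()
    identity = np.eye(len(x0))
    estimates = []
    for y, s in zip(obs, states):
        A, C, Q, R = params[int(s)]  # switch to the model of the active state
        # Predict step with the active state's dynamics.
        x = A @ x
        P = A @ P @ A.T + Q
        # Update step with the active state's observation model.
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ (y - C @ x)
        P = (identity - K @ C) @ P
        estimates.append(x)
    return np.array(estimates)
```

Because each state has its own matrices, the overall mapping is piecewise linear, which is one way a sequence of linear models can approximate a nonlinear relation.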
6 Figures and Tables

