Speech-to-video synthesis using facial animation parameters

Visual information presented alongside audio can improve speech understanding in noisy environments. This additional information could be especially useful for people with impaired hearing who are able to speechread. This paper focuses on the problem of synthesizing the Facial Animation Parameters (FAPs), supported by the MPEG-4 standard for the visual representation of speech, from a narrowband (telephone) acoustic speech signal. A correlation Hidden Markov Model (CHMM) system for…