Audio-visual speech recognition using depth information from the Kinect in noisy video conditions

@inproceedings{Galatas2012AudiovisualSR,
  title={Audio-visual speech recognition using depth information from the Kinect in noisy video conditions},
  author={Georgios Galatas and Gerasimos Potamianos and Fillia Makedon},
  booktitle={PETRA},
  year={2012}
}
In this paper, we build on our recent work, in which we successfully incorporated facial depth data of a speaker, captured by the Microsoft Kinect device, as a third data stream in an audio-visual automatic speech recognizer. In particular, we focus on whether the depth stream provides sufficient speech information to improve system robustness…