Large-vocabulary audio-visual speech recognition by machines and humans

Gerasimos Potamianos, Chalapathy Neti, Giridharan Iyengar, Eric D Helmuth
We compare automatic recognition with human perception of audio-visual speech in the large-vocabulary continuous speech recognition (LVCSR) domain. Specifically, we study the benefit of the visual modality for both machines and humans when it is combined with audio degraded by speech-babble noise at various signal-to-noise ratios (SNRs). We first consider an automatic speechreading system with a pixel-based visual front end that uses feature fusion for bimodal integration, and we compare its…
This paper has 45 citations.


