DBN based multi-stream models for audio-visual speech recognition

John N. Gowdy, Amarnag Subramanya, Chris D. Bartels, Jeff A. Bilmes
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
In this paper, we propose a model based on dynamic Bayesian networks (DBNs) to integrate information from multiple audio and visual streams. We compare the DBN-based system, implemented using the Graphical Model Toolkit (GMTK), with a classical HMM, implemented in the Hidden Markov Model Toolkit (HTK), for both the single-stream and two-stream integration problems. We also propose a new model (mixed integration) to integrate information from three or more streams derived from different modalities…