Robust Audio-Visual Speech Recognition Based on Late Integration

@article{Lee2008RobustAS,
  title={Robust Audio-Visual Speech Recognition Based on Late Integration},
  author={Jong-Seok Lee and Cheol Hoon Park},
  journal={IEEE Transactions on Multimedia},
  year={2008},
  volume={10},
  pages={767--779}
}
Audio-visual speech recognition (AVSR) using acoustic and visual signals of speech has received attention because of its robustness in noisy environments. In this paper, we present a late integration scheme-based AVSR system whose robustness under various noise conditions is improved by enhancing the performance of the three parts composing the system. First, we improve the performance of the visual subsystem by using the stochastic optimization method for the hidden Markov models as the speech…
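As a generic illustration of what "late integration" usually means in AVSR, the sketch below fuses the per-word scores of two independently run recognizers with a single reliability weight. It is not taken from the paper: the function name `late_integration`, the dictionary-based score format, and the fixed `audio_weight` parameter are all assumptions made for this example, and the truncated abstract does not say how the authors choose or adapt the weight.

```python
def late_integration(audio_loglik, visual_loglik, audio_weight):
    """Fuse per-word log-likelihoods from two separate recognizers.

    Hypothetical helper for illustration only.
    audio_loglik / visual_loglik: dict mapping each candidate word to the
    log-likelihood assigned by the acoustic / visual recognizer.
    audio_weight: value in [0, 1]; the visual stream gets 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_loglik.keys() != visual_loglik.keys():
        raise ValueError("both recognizers must score the same words")

    # Weighted sum of the two streams' scores, computed per candidate word.
    fused = {
        word: audio_weight * audio_loglik[word]
        + (1.0 - audio_weight) * visual_loglik[word]
        for word in audio_loglik
    }
    # The recognized word is the one with the highest fused score.
    best_word = max(fused, key=fused.get)
    return best_word, fused


# Made-up scores: the audio stream slightly prefers "no",
# the visual stream clearly prefers "yes".
audio_scores = {"yes": -120.0, "no": -118.0}
visual_scores = {"yes": -80.0, "no": -95.0}

clean_choice, _ = late_integration(audio_scores, visual_scores, audio_weight=0.9)
noisy_choice, _ = late_integration(audio_scores, visual_scores, audio_weight=0.3)
print(clean_choice, noisy_choice)
```

Lowering `audio_weight` shifts the decision toward the visual recognizer, which is the general reason a late-integration system can stay usable when the acoustic signal is noisy.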
This paper has 39 citations.

Citations

Publications citing this paper.

Improved Bimodal Speech Recognition Study Based on Product Hidden Markov Model

Int. J. Fuzzy Logic and Intelligent Systems • 2013

Word Spotting in Silent Lip Videos

2018 IEEE Winter Conference on Applications of Computer Vision (WACV) • 2018

References

Publications referenced by this paper.

Spoken Language Processing: A Guide to Theory, Algorithm, and System Development

X. Huang, A. Acero, H.-W. Hon
Upper Saddle River, • 2001

Dynamic Stream Weight Modeling for Audio-Visual Speech Recognition

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 • 2007

Audio-Visual Speech Recognition: Stochastic Optimization of Hidden Markov Models, Modeling of Interframe Correlations and Integration With Neural Networks

J.-S. Lee
Ph.D. dissertation, Dept. Elect. Eng. Comput • 2006

Training Hidden Markov Models by Hybrid Simulated Annealing for Visual Speech Recognition

2006 IEEE International Conference on Systems, Man and Cybernetics • 2006

Visual model structures and synchrony constraints for audio-visual speech recognition

IEEE Transactions on Audio, Speech, and Language Processing • 2006

A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization

Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. • 2005
