Lip Reading Sentences in the Wild
@inproceedings{Chung2017LipRS,
  title={Lip Reading Sentences in the Wild},
  author={Joon Son Chung and A. Senior and Oriol Vinyals and Andrew Zisserman},
  booktitle={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={3444--3453}
}
The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem – unconstrained natural language sentences, and in the wild videos. Our key contributions are: (1) a Watch, Listen, Attend and Spell (WLAS) network that learns to transcribe videos of mouth motion to characters, (2) a curriculum…
342 Citations
Deep Audio-Visual Speech Recognition
- Computer Science, Medicine
- IEEE transactions on pattern analysis and machine intelligence
- 2018
- 149
Experimenting with lipreading for large vocabulary continuous speech recognition
- Computer Science
- Journal on Multimodal User Interfaces
- 2018
- 1
A Lip Reading Model Using CNN with Batch Normalization
- Computer Science
- 2018 Eleventh International Conference on Contemporary Computing (IC3)
- 2018
- 4
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
- Computer Science, Engineering
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
- 6
- Highly Influenced
Word Spotting in Silent Lip Videos
- Computer Science
- 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2018
- 12
- Highly Influenced
Deep Learning for Lip Reading using Audio-Visual Information for Urdu Language
- Computer Science, Engineering
- ArXiv
- 2018
- 5
- Highly Influenced