Out of Time: Automated Lip Sync in the Wild


The goal of this work is to determine the audio-video synchronisation between mouth motion and speech in a video. We propose a two-stream ConvNet architecture that enables the mapping between the sound and the mouth images to be trained end-to-end from unlabelled data. The trained network is used to determine the lip-sync error in a video. We apply the…
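Once the two-stream network is trained, the lip-sync error can be estimated by comparing audio and video embeddings at a range of candidate offsets and picking the shift with the smallest average distance. A minimal sketch of that offset search (not the authors' code; function names and the distance choice are illustrative assumptions):

```python
# Sketch: estimate the lip-sync offset from per-frame embeddings produced by
# the two streams of a trained network. The shift that minimises the mean
# audio-video embedding distance is taken as the sync error (in frames).

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def estimate_sync_offset(audio_feats, video_feats, max_shift=10):
    """Return the shift of the video stream relative to the audio stream
    (within +/- max_shift frames) that minimises the mean pairwise
    embedding distance over the overlapping frames."""
    best_shift, best_dist = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        dists = []
        for t, v in enumerate(video_feats):
            ta = t + shift
            if 0 <= ta < len(audio_feats):
                dists.append(euclidean(audio_feats[ta], v))
        if dists:
            mean = sum(dists) / len(dists)
            if mean < best_dist:
                best_dist, best_shift = mean, shift
    return best_shift
```

For example, if the video features are the audio features delayed by three frames, the search recovers an offset of 3. In practice the distances would be averaged over many short clips to make the estimate robust.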
DOI: 10.1007/978-3-319-54427-4_19


14 Figures and Tables



Cite this paper

@inproceedings{Chung2016OutOT,
  title={Out of Time: Automated Lip Sync in the Wild},
  author={Joon Son Chung and Andrew Zisserman},
  booktitle={ACCV Workshops},
  year={2016}
}