Learning Speech-driven 3D Conversational Gestures from Video

  • Authors: Ikhsanul Habibie, Weipeng Xu, Dushyant Mehta, Lingjie Liu, Hans-Peter Seidel, Gerard Pons-Moll, Mohamed A. Elgharib, Christian Theobalt
  • Venue: Proceedings of the 21st ACM International Conference on Intelligent Virtual Agents
  • Published 13 February 2021
  • Computer Science
We propose the first approach to synthesize synchronous 3D conversational body and hand gestures, as well as 3D face and head animations, of a virtual character from speech input. Our algorithm uses a CNN architecture that leverages the inherent correlation between facial expression and hand gestures. Synthesis of conversational body gestures is a multi-modal problem, since many similar gestures can plausibly accompany the same input speech. To synthesize plausible body gestures in this…

