Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video

Authors: Rishabh Garg, Ruohan Gao, Kristen Grauman
Venue: British Machine Vision Conference

Binaural audio provides human listeners with an immersive spatial sound experience, but most existing videos lack binaural audio recordings. We propose an audio spatialization method that draws on visual information in videos to convert their monaural (single-channel) audio to binaural audio. Whereas existing approaches leverage visual features extracted directly from video frames, our approach explicitly disentangles the geometric cues present in the visual stream to guide the learning process… 
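
To make the mono-to-binaural task concrete, here is a minimal sketch of one common way to frame it. It assumes, and the abstract does not state this, that the mono signal is the sum of the left and right channels and that a model's job is to predict their difference, from which both channels can be recovered. The function names `to_mono_and_diff` and `to_binaural` are hypothetical, and the true difference signal stands in for a model prediction.

```python
# Illustrative only: the sum/difference formulation is an assumption,
# not something stated in the abstract above.

def to_mono_and_diff(left, right):
    """Mix two channels into a mono sum and a left-minus-right difference."""
    mono = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return mono, diff


def to_binaural(mono, diff):
    """Recover (left, right) from the mono sum and a difference signal.

    In a learned system the difference would come from a model that is
    conditioned on the video; here it is simply passed in.
    """
    left = [(m + d) / 2 for m, d in zip(mono, diff)]
    right = [(m - d) / 2 for m, d in zip(mono, diff)]
    return left, right


# Toy waveform samples (hypothetical values).
left = [0.5, 0.25, -0.5]
right = [0.25, 0.25, 0.0]

mono, diff = to_mono_and_diff(left, right)
rec_left, rec_right = to_binaural(mono, diff)
```

Under this framing, only the difference signal carries spatial information, so that is the quantity visual cues would need to help predict; the mono input alone cannot distinguish left from right.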

