Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions
Sound source localization from a binaural input is a challenging problem, particularly when multiple sources are active simultaneously and reverberation or background noise is present. In this work, we investigate a multi-source localization framework in which monaural source segregation is used to increase the robustness of azimuth estimates from a binaural input. We demonstrate a performance improvement relative to binaural-only methods, assuming a known number of spatially stationary sources. We also propose a flexible azimuth-dependent model of binaural features that independently captures characteristics of the binaural setup and of the environmental conditions, allowing for adaptation to new environments or calibration to an unseen binaural setup. Results with both simulated and recorded impulse responses show that robust performance can be achieved with limited prior training.
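The abstract does not state which binaural features the azimuth-dependent model uses. Purely as an illustration, the sketch below assumes the two cues most commonly used in binaural localization, the interaural time difference (ITD) and the interaural level difference (ILD), and computes them for a single broadband frame. The function names `estimate_itd` and `estimate_ild`, the 1 ms lag limit, and the sign conventions are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def estimate_itd(left, right, fs, max_itd=1e-3):
    """Estimate the interaural time difference (seconds) by cross-correlation.

    Assumed convention: a positive value means the right-ear signal lags
    the left-ear signal. max_itd limits the search to plausible lags.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(max_itd * fs))

    # c[k] = sum_n right[n + k] * left[n]; the peak sits at the delay of
    # the right channel relative to the left channel.
    xcorr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))

    # Keep only lags inside the physically plausible range.
    keep = np.abs(lags) <= max_lag
    best_lag = lags[keep][np.argmax(xcorr[keep])]
    return best_lag / fs


def estimate_ild(left, right, eps=1e-12):
    """Estimate the interaural level difference in dB.

    Assumed convention: a positive value means the left ear is louder.
    eps guards against division by zero and log of zero on silent frames.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    energy_left = np.sum(left ** 2) + eps
    energy_right = np.sum(right ** 2) + eps
    return 10.0 * np.log10(energy_left / energy_right)
```

In a complete system these cues would typically be computed per frequency band and per time frame, and then compared against azimuth-dependent models; that mapping, and the monaural segregation stage, are outside the scope of this sketch.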