Experimental Performance Assessment of a Particle Filter with Voice Activity Data Fusion for Acoustic Speaker Tracking

Abstract

Acoustic source localization and tracking (ASLT) in reverberant environments by means of a microphone array is a challenging task in several respects. One of the main issues in real-world situations involving human speakers is the presence of silence gaps in the speech, which can easily cause the tracking algorithm to lose the speaker, even in practical environments with low to moderate noise and reverberation levels. This work is concerned with an implementation of the ASLT algorithm proposed in E. Lehmann et al., which circumvents this problem by integrating measurements from a voice activity detector (VAD) within the tracking algorithm framework. The tracking performance of this method is tested experimentally using audio data recorded in a real reverberant room. For this purpose, we describe a quick and efficient way of determining the ground-truth speaker location versus time, an operation that is not always easy to perform. The experimental results confirm the improved robustness of the method presented in E. Lehmann et al. (compared to a previously proposed non-VAD ASLT algorithm) when tracking sources emitting real-world speech signals, which typically involve significant silence gaps between utterances.
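
To make the idea of fusing VAD measurements with a tracker concrete, the following is a minimal one-dimensional toy sketch. It is not the algorithm of E. Lehmann et al.: the mixture likelihood, the random-walk motion model, and all names and values (`vad_weighted_likelihood`, `pf_step`, `run_demo`, `sigma`, `room_size`, `motion_std`) are assumptions chosen purely for illustration. The sketch assumes the VAD supplies a speech probability between 0 and 1, which is used to blend a Gaussian measurement model with a uniform one, so that particles are not pulled toward spurious location estimates during silence.

```python
import math
import random


def vad_weighted_likelihood(particle, observation, speech_prob,
                            sigma=0.2, room_size=5.0):
    """Hypothetical mixture likelihood for one particle.

    speech_prob near 1: trust the location observation (Gaussian term).
    speech_prob near 0: ignore it (uniform term over the room length).
    """
    gauss = math.exp(-0.5 * ((observation - particle) / sigma) ** 2)
    gauss /= sigma * math.sqrt(2.0 * math.pi)
    uniform = 1.0 / room_size
    return speech_prob * gauss + (1.0 - speech_prob) * uniform


def pf_step(particles, observation, speech_prob, rng, motion_std=0.05):
    """One predict/update/resample cycle of a bootstrap particle filter."""
    # Predict: assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Update: weight each particle with the VAD-blended likelihood.
    weights = [vad_weighted_likelihood(p, observation, speech_prob)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: multinomial resampling (choices normalizes the weights).
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the source position: the particle mean."""
    return sum(particles) / len(particles)


def run_demo(seed=0, n_particles=200, true_pos=2.0, room_size=5.0):
    """Track a static source through 20 speech frames and 20 silent frames."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, room_size) for _ in range(n_particles)]
    for _ in range(20):
        # Speech frames: noisy but informative observation, high VAD output.
        obs = true_pos + rng.gauss(0.0, 0.1)
        particles = pf_step(particles, obs, 0.9, rng)
    for _ in range(20):
        # Silence frames: spurious observation anywhere in the room, VAD at 0.
        obs = rng.uniform(0.0, room_size)
        particles = pf_step(particles, obs, 0.0, rng)
    return estimate(particles)
```

Calling `run_demo()` returns the position estimate after the silence gap. In this toy setup the particle cloud only diffuses under the motion model while the VAD reports silence, instead of chasing the spurious observations.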

Cite this paper

@article{Lehmann2006ExperimentalPA,
  title={Experimental Performance Assessment of a Particle Filter with Voice Activity Data Fusion for Acoustic Speaker Tracking},
  author={E. A. Lehmann and A. Johansson},
  journal={Proceedings of the 7th Nordic Signal Processing Symposium - NORSIG 2006},
  year={2006},
  pages={126-129}
}