Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

Abstract

The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although these models capture audio spectral structure well, they neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a temporal dimension and improves source separation performance. However, the factorial nature of this model makes the complexity of inference exponential in the number of sound sources. Here, we present a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources. Our algorithm performs comparably to exact inference in the original N-FHMM but is significantly faster. In typical configurations of the N-FHMM, our method achieves around a 30x increase in speed.
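The exponential-versus-linear contrast can be illustrated by simple counting. The sketch below is not the authors' algorithm. It only compares how many hidden-state configurations must be tracked per time frame when all sources are handled jointly (as in exact inference over a factorial model) with the number tracked when each source gets its own factor (as in a fully factored variational approximation). The function names and the choice of 20 states per source are assumptions made for illustration, and the counts ignore all other costs, so they do not correspond to the reported speed-up.

```python
def joint_state_count(num_sources: int, states_per_source: int) -> int:
    """Joint hidden-state configurations per frame: K ** S (exponential in S)."""
    return states_per_source ** num_sources


def factored_state_count(num_sources: int, states_per_source: int) -> int:
    """Per-source states tracked separately per frame: S * K (linear in S)."""
    return num_sources * states_per_source


if __name__ == "__main__":
    K = 20  # hypothetical number of states per source, chosen for illustration
    for S in (1, 2, 3, 4):
        joint = joint_state_count(S, K)
        factored = factored_state_count(S, K)
        print(f"sources={S}  joint={joint}  factored={factored}")
```

With two sources the joint count is already K squared, and each additional source multiplies it by K again, whereas the factored count grows by only K per source.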

Cite this paper

@inproceedings{Mysore2012VariationalII, title={Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation}, author={Gautham J. Mysore and Maneesh Sahani}, booktitle={ICML}, year={2012} }