Non-negative Hidden Markov Modeling of Audio with Application to Source Separation


In recent years, a great deal of work has modeled audio using non-negative matrix factorization and its probabilistic counterparts, because they yield rich models that are useful for source separation and automatic music transcription. Given a sound source, these algorithms learn a dictionary of spectral vectors that best explains it. This dictionary is, however, learned in a manner that disregards a very important aspect of sound: its temporal structure. We propose a novel algorithm, the non-negative hidden Markov model (N-HMM), that extends these models by jointly learning several small spectral dictionaries and a Markov chain that describes the structure of changes between these dictionaries. We also extend this algorithm to the non-negative factorial hidden Markov model (N-FHMM) to model sound mixtures, and demonstrate that it yields superior performance in single-channel source separation tasks.
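
To make the structure described in the abstract concrete, the following is a minimal sketch of a generative process in the spirit of the N-HMM: a Markov chain selects one hidden state per time frame, and each frame's spectrum is a non-negative combination of the atoms in that state's own small dictionary. It is an illustration under stated assumptions, not the authors' implementation. The function name `sample_nhmm_spectrogram`, the Dirichlet-distributed mixture weights, and the toy dimensions are all hypothetical choices, and the sketch only samples from such a model; it does not perform the joint learning the paper proposes.

```python
import numpy as np


def sample_nhmm_spectrogram(dictionaries, transition, initial, n_frames, rng):
    """Sample a toy spectrogram from an N-HMM-style generative model.

    dictionaries: (n_states, n_freq, n_components) non-negative array,
                  one small spectral dictionary per hidden state.
    transition:   (n_states, n_states) matrix whose rows sum to 1.
    initial:      (n_states,) initial state distribution.
    """
    n_states, n_freq, n_components = dictionaries.shape
    states = np.empty(n_frames, dtype=int)
    spectrogram = np.empty((n_freq, n_frames))
    for t in range(n_frames):
        # The Markov chain decides which dictionary is active in this frame.
        probs = initial if t == 0 else transition[states[t - 1]]
        states[t] = rng.choice(n_states, p=probs)
        # Assumption: mixture weights are drawn from a flat Dirichlet.
        weights = rng.dirichlet(np.ones(n_components))
        # The frame is a non-negative combination of the active atoms only.
        spectrogram[:, t] = dictionaries[states[t]] @ weights
    return states, spectrogram


# Hypothetical toy setup: 3 states, 16 frequency bins, 2 atoms per dictionary.
rng = np.random.default_rng(0)
dictionaries = rng.random((3, 16, 2))
transition = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
])
initial = np.array([1.0, 0.0, 0.0])
states, spectrogram = sample_nhmm_spectrogram(
    dictionaries, transition, initial, n_frames=50, rng=rng
)
```

In this sketch the transition matrix only allows a state to persist or move on to the next one, so the sampled state sequence cycles through the dictionaries in a fixed order. That ordering is the kind of temporal structure a single dictionary learned without regard to time cannot represent.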

DOI: 10.1007/978-3-642-15995-4_18

Semantic Scholar estimates that this publication has 109 citations based on the available data.

Cite this paper

@inproceedings{Mysore2010NonnegativeHM,
  title={Non-negative Hidden Markov Modeling of Audio with Application to Source Separation},
  author={Gautham J. Mysore and Paris Smaragdis and Bhiksha Raj},
  booktitle={LVA/ICA},
  year={2010}
}