Supervised non-Euclidean sparse NMF via bilevel optimization with applications to speech enhancement


Traditionally, nonnegative matrix factorization (NMF) algorithms consist of two separate stages: a training stage, in which a generative model is learned, and a testing stage, in which the pre-learned model is used in a high-level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method that adapts the basis spectra learned in the first stage to improve performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program that can be efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors on the activations and to allow non-Euclidean data terms such as β-divergences. The framework is evaluated on single-channel speech enhancement tasks.
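
For orientation, a bilevel program of this kind can be sketched in generic form as below. The notation is assumed for illustration and is not taken from the paper: V is the nonnegative input spectrogram, W the basis spectra, H the activations, λ a sparsity weight, and ℓ a task loss measured on the output of the second stage. The β-divergence is given in its standard definition, which reduces to half the squared Euclidean distance at β = 2, the generalized Kullback-Leibler divergence at β = 1, and the Itakura-Saito divergence at β = 0.

```latex
% Illustrative notation (assumed, not the paper's own):
%   V: nonnegative data, W: basis spectra, H: activations,
%   \lambda: sparsity weight, \ell: task loss.

% Upper level: adapt the basis spectra to the task.
% Lower level: sparse NMF inference of the activations for fixed W.
\begin{align}
  \min_{W \ge 0} \quad & \ell\bigl(W, H^{\star}(W)\bigr) \\
  \text{s.t.} \quad & H^{\star}(W) \in \operatorname*{arg\,min}_{H \ge 0}
      \; D_{\beta}(V \,\|\, WH) + \lambda \lVert H \rVert_{1}
\end{align}

% Standard beta-divergence, summed over all matrix entries.
\begin{equation}
  D_{\beta}(V \,\|\, \hat{V}) = \sum_{f,t} d_{\beta}\bigl(V_{ft} \mid \hat{V}_{ft}\bigr),
  \qquad
  d_{\beta}(x \mid y) =
  \begin{cases}
    \dfrac{x^{\beta} + (\beta - 1)\, y^{\beta} - \beta\, x\, y^{\beta - 1}}{\beta(\beta - 1)},
      & \beta \notin \{0, 1\}, \\[2ex]
    x \log \dfrac{x}{y} - x + y, & \beta = 1, \\[2ex]
    \dfrac{x}{y} - \log \dfrac{x}{y} - 1, & \beta = 0.
  \end{cases}
\end{equation}
```

In this reading, stochastic gradient descent acts on the upper-level variable W, with each step requiring the lower-level solution for the sampled data; how the gradient through that solution is obtained is specific to the paper and is not reproduced here.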

DOI: 10.1109/HSCMA.2014.6843241
