Universal speech models for speaker independent single channel source separation


Supervised and semi-supervised source separation algorithms based on non-negative matrix factorization have been shown to be quite effective. However, they require isolated training examples of one or more sources, which are often difficult to obtain, and this limits their practical applicability. We examine the problem of efficiently utilizing general training data in the absence of specific training examples. Specifically, we propose a method to learn a universal speech model from a general corpus of speech and show how to use this model to separate speech from other sound sources. This model is used in lieu of a speech model trained on speaker-dependent training examples, and thus circumvents the aforementioned problem. Our experimental results show that our method achieves nearly the same performance as when speaker-dependent training examples are used. Furthermore, we show that our method improves performance when training data of the non-speech source is available.
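The abstract does not spell out the algorithm, so the following is only a minimal sketch of generic semi-supervised NMF separation, not the authors' method. It assumes a magnitude spectrogram `V`, a fixed speech dictionary `W_speech` (standing in for a model learned from a general speech corpus), Kullback-Leibler multiplicative updates, and a soft mask for reconstruction. The function name, parameter names, and default values are all hypothetical.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=200, eps=1e-9, seed=0):
    """Split a non-negative spectrogram V (F x T) into speech and non-speech parts.

    W_speech (F x K_s) is held fixed; the non-speech bases and all
    activations are learned from the mixture itself. (Illustrative only.)
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    K_s = W_speech.shape[1]

    # Random non-negative initialisation of the unknown factors.
    W_noise = rng.random((F, n_noise)) + eps
    H = rng.random((K_s + n_noise, T)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])

        # KL multiplicative update for all activations.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)

        # KL multiplicative update for the non-speech bases only;
        # the speech dictionary stays untouched.
        R = V / (W @ H + eps)
        H_noise = H[K_s:]
        W_noise *= (R @ H_noise.T) / (H_noise.sum(axis=1)[None, :] + eps)

    # Soft mask: share of the modelled energy explained by the speech bases.
    W = np.hstack([W_speech, W_noise])
    mask = (W_speech @ H[:K_s]) / (W @ H + eps)
    return mask * V, (1.0 - mask) * V
```

Because the mask and its complement sum to one, the two estimates add back up to the input spectrogram. If training data for the non-speech source were available, one could fix `W_noise` as well and update only `H`, which would correspond to the fully supervised setting the abstract mentions.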

DOI: 10.1109/ICASSP.2013.6637625
