Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction


This paper presents a new fundamental technique for separating the sources in single-channel audio signals. Although nonnegative matrix factorization (NMF) has recently become very popular for music source separation, it operates only on the amplitude or power spectrogram of a given mixture signal and discards the phase entirely. The component spectrograms are typically estimated with a Wiener filter that reuses the phase of the mixture spectrogram, but this rough phase reconstruction makes it hard to recover high-quality source signals because the estimated spectrograms are inconsistent, i.e., they do not correspond to any actual time-domain signals. To avoid phase reconstruction in the frequency domain, we use positive semidefinite tensor factorization (PSDTF) to estimate the source signals directly from the mixture signal in the time domain. Because PSDTF is a natural extension of NMF, an efficient multiplicative update algorithm for PSDTF can be derived. Experimental results show that PSDTF outperforms conventional NMF variants in terms of source separation quality.
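For orientation, the conventional baseline described in the abstract (NMF on a power spectrogram followed by Wiener filtering with the mixture phase) can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' code and not PSDTF: the Euclidean cost, the function names `nmf` and `wiener_separate`, and the random toy spectrogram standing in for a real STFT are all illustrative choices.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix V (freq x frames) as V ~ W @ H.

    Uses the standard multiplicative updates for the Euclidean cost
    (an assumption; the abstract does not fix a particular cost).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def wiener_separate(X, W, H, eps=1e-12):
    """Split a complex mixture spectrogram X into one spectrogram per component.

    Each component gets a real-valued soft mask; multiplying the complex
    mixture by that mask reuses the mixture phase for every component.
    """
    parts = W[:, :, None] * H[None, :, :]            # (freq, comp, frames)
    masks = parts / (parts.sum(axis=1, keepdims=True) + eps)
    return masks * X[:, None, :]                     # complex, (freq, comp, frames)


# Toy complex "spectrogram" standing in for the STFT of a mixture signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((64, 40)) + 1j * rng.standard_normal((64, 40))
V = np.abs(X) ** 2                                   # power spectrogram; phase dropped

W, H = nmf(V, n_components=3)
Y = wiener_separate(X, W, H)
```

Every slice `Y[:, k, :]` carries exactly the phase of `X`, so inverting it to the time domain generally yields a signal whose own spectrogram differs from `Y[:, k, :]`. That is the inconsistency the abstract refers to.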


Cite this paper

@inproceedings{Yoshii2013BeyondNT,
  title={Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction},
  author={Kazuyoshi Yoshii and Ryota Tomioka and Daichi Mochihashi and Masataka Goto},
  booktitle={ISMIR},
  year={2013}
}