Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction


This paper presents a new fundamental technique for source separation of single-channel audio signals. Although non-negative matrix factorization (NMF) has recently become very popular for music source separation, it operates only on the amplitude or power spectrogram of a given mixture signal and completely discards the phase. The component spectrograms are typically estimated using a Wiener filter that reuses the phase of the mixture spectrogram, but such rough phase reconstruction makes it hard to recover high-quality source signals because the estimated spectrograms are inconsistent, i.e., they do not correspond to any actual time-domain signal. To avoid frequency-domain phase reconstruction, we use positive semidefinite tensor factorization (PSDTF) to estimate the source signals directly from the mixture signal in the time domain. Since PSDTF is a natural extension of NMF, an efficient multiplicative update algorithm for PSDTF can be derived. Experimental results show that PSDTF outperforms conventional NMF variants in terms of source separation quality.
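To make the conventional baseline concrete, the following is a minimal sketch of NMF followed by Wiener filtering, not the PSDTF method proposed here. It assumes a power spectrogram, the KL-divergence variant of the multiplicative updates, and one basis per source; the function names `nmf_kl` and `wiener_separate` and all parameters are illustrative and do not come from the paper. The point it shows is that every separated component is the complex mixture spectrogram scaled by a real, non-negative mask, so each component inherits the mixture phase unchanged.

```python
import numpy as np


def nmf_kl(V, n_components, n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative matrix V (freq x time) as W @ H.

    Uses the standard multiplicative updates for the KL divergence
    (an assumed choice; other divergences are equally common).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_time)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # Update bases; the denominator is the row sum of H per component.
        W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)
        WH = W @ H + eps
        # Update activations; the denominator is the column sum of W.
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
    return W, H


def wiener_separate(X, W, H, eps=1e-12):
    """Split the complex mixture spectrogram X into one component per basis.

    Each mask is real and non-negative, so multiplying it with X rescales
    the magnitude only and reuses the phase of the mixture.
    """
    WH = W @ H + eps
    return [(np.outer(W[:, k], H[k]) / WH) * X for k in range(W.shape[1])]


if __name__ == "__main__":
    # Random complex matrix standing in for an STFT of a mixture signal.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((64, 100)) + 1j * rng.standard_normal((64, 100))
    W, H = nmf_kl(np.abs(X) ** 2, n_components=2)
    components = wiener_separate(X, W, H)
    print(len(components), components[0].shape)
```

Because the masks sum to approximately one in every time-frequency bin, the components add back up to the mixture, yet nothing in this procedure guarantees that an individual masked spectrogram is the transform of any time-domain signal.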

14th International Society for Music Information Retrieval Conference, 2013