Learn More
In this paper, we discuss the evaluation of blind audio source separation (BASS) algorithms. Depending on the exact application, different distortions can be allowed between an estimated source and the wanted true source. We consider four different sets of such allowed distortions, from time-invariant gains to time-varying filters. In each case, we(More)
This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence as special cases (β = 2, 1, 0 respectively). The proposed(More)
This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can(More)
We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is(More)
We present two improvements/extensions of a previous deterministic blind source separation (BSS) technique, by Belouchrani and Amin, that involves joint-diagonalization of a set of Cohen's class spatial time-frequency distributions. The first contribution concerns the extension of the BSS technique to the stochastic case using spatial Wigner-Ville spectrum.(More)
This paper describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parametrized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence and the Itakura-Saito divergence as special cases (β = 2, 1, 0 respectively). The proposed(More)
Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separation, with the human ability of focusing on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of(More)
In this paper we propose a simple time-frequency Gaussian model of audio signals that allows for separation of possibly underdetermined and noisy linear instantaneous mixtures. An efficient EM algorithm is proposed to estimate the mixing matrix, the noise covariance and covariances of the source t-f coefficients over a chosen frame/subband tiling of the(More)
Nonnegative matrix factorization (NMF) with the Itakura-Saito divergence has proven efficient for audio source separation and music transcription, where the signal power spectrogram is factored into a “dictionary” matrix times an “activation” matrix. Given the nature of audio signals it is expected that the activation(More)
We formulate a novel extension of nonnegative matrix fac-torization (NMF) to take into account partial information on source-specific activity in the spectrogram. This information comes in the form of masking coefficients, such as those found in an ideal binary mask. We show that state-of-the-art results in source separation may be achieved with only a(More)