In this paper, we discuss the evaluation of blind audio source separation (BASS) algorithms. Depending on the exact application, different distortions can be allowed between an estimated source and the wanted true source. We consider four different sets of such allowed distortions, from time-invariant gains to time-varying filters. In each case, we …
This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence as special cases (β = 2, 1, 0 respectively). The proposed …
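As an illustration of the special cases listed in this abstract, here is a minimal Python sketch of the β-divergence as it is commonly defined in the NMF literature. The function name `beta_divergence` and the use of NumPy are assumptions made for this example and are not taken from the letter. Note that under this common definition the β = 2 case equals half the squared Euclidean distance.

```python
import numpy as np

def beta_divergence(x, y, beta):
    """Sum of elementwise beta-divergences d_beta(x | y) for strictly positive inputs.

    Hypothetical helper for illustration; the definition used is the common one:
      beta = 0 -> Itakura-Saito divergence
      beta = 1 -> generalized Kullback-Leibler divergence
      beta = 2 -> half the squared Euclidean distance
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 0:
        ratio = x / y
        return float(np.sum(ratio - np.log(ratio) - 1.0))
    if beta == 1:
        return float(np.sum(x * np.log(x / y) - x + y))
    # General case, valid for any beta other than 0 and 1.
    num = x**beta + (beta - 1.0) * y**beta - beta * x * y**(beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))

# Small demonstration on arbitrary positive vectors.
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 1.0, 3.5])
for b in (0, 1, 2):
    print(b, beta_divergence(x_demo, y_demo, b))
```

The two limit cases β = 0 and β = 1 are handled separately because the general expression divides by β(β − 1).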
This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed Gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can …
We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is …
Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent, it is related to the concept of source separation, with the human ability of focusing on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of …
This paper describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parametrized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence and the Itakura-Saito divergence as special cases (β = 2, 1, 0 respectively). The proposed …
We present two improvements/extensions of a previous deterministic blind source separation (BSS) technique, by Belouchrani and Amin, that involves joint-diagonalization of a set of Cohen's class spatial time-frequency distributions. The first contribution concerns the extension of the BSS technique to the stochastic case using spatial Wigner-Ville spectrum. …
In this paper we propose a simple time-frequency Gaussian model of audio signals that allows for separation of possibly underdetermined and noisy linear instantaneous mixtures. An efficient EM algorithm is proposed to estimate the mixing matrix, the noise covariance and covariances of the source t-f coefficients over a chosen frame/subband tiling of the …
We propose an unsupervised inference procedure for audio source separation. Components in nonnegative matrix factorization (NMF) are grouped automatically into audio sources via a penalized maximum likelihood approach. The penalty term we introduce favors sparsity at the group level, and is motivated by the assumption that the local amplitude of the sources …
Nonnegative matrix factorization (NMF) with the Itakura-Saito divergence has proven efficient for audio source separation and music transcription, where the signal power spectrogram is factored into a “dictionary” matrix times an “activation” matrix. Given the nature of audio signals it is expected that the activation …
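To make the "dictionary times activation" factorization concrete, the following is a minimal Python sketch of NMF under the Itakura-Saito divergence, using a widely known multiplicative update with exponent 1/2. It is an illustrative baseline only and is not claimed to be the algorithm of any paper listed here. The names `is_nmf`, `is_divergence`, `W` (dictionary), `H` (activations) and the synthetic matrix `V` standing in for a power spectrogram are all assumptions made for this example.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries (inputs strictly positive)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))

def is_nmf(V, n_components, n_iter=100, seed=0, eps=1e-12):
    """Approximate V (F x N, positive) as W @ H with multiplicative updates.

    Hypothetical helper: W is the F x K "dictionary", H the K x N "activation" matrix.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, N)) + eps
    costs = []
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        V_hat = W @ H + eps
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        # Update dictionary with the activations held fixed.
        V_hat = W @ H + eps
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs

# Synthetic stand-in for a power spectrogram (20 frequency bins, 30 frames).
rng_demo = np.random.default_rng(1)
V = (rng_demo.random((20, 3)) + 0.1) @ (rng_demo.random((3, 30)) + 0.1)
W, H, costs = is_nmf(V, n_components=3, n_iter=50)
print("first cost:", costs[0], "last cost:", costs[-1])
```

In a real setting `V` would be the squared magnitude of an STFT; the scale ambiguity between `W` and `H` is left unresolved in this sketch.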