Monaural source separation is useful in many real-world applications, though it is a challenging problem. In this paper, we study deep learning for monaural speech separation. We propose the joint optimization of the deep learning models (deep neural networks and recurrent neural networks) with an extra masking layer, which enforces a reconstruction…
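The abstract above is truncated, so the exact form of its masking layer is not shown here. As a rough illustration only, the sketch below assumes a soft time-frequency mask, a common choice in which two estimated magnitude spectrograms are normalized so that the masked outputs sum to the mixture. The function name `soft_mask_separate` and its arguments are hypothetical, not taken from the paper.

```python
import numpy as np

def soft_mask_separate(est_1, est_2, mixture, eps=1e-8):
    """Split `mixture` into two parts with a soft time-frequency mask.

    est_1, est_2: raw source estimates (e.g. network outputs), same shape as mixture.
    mixture: magnitude spectrogram of the mixed signal (frequency x time).
    This is an assumed masking scheme, not necessarily the paper's layer.
    """
    mag_1 = np.abs(est_1)
    mag_2 = np.abs(est_2)
    # The mask lies in [0, 1); eps guards against division by zero.
    mask_1 = mag_1 / (mag_1 + mag_2 + eps)
    out_1 = mask_1 * mixture
    out_2 = (1.0 - mask_1) * mixture  # complementary mask, so out_1 + out_2 == mixture
    return out_1, out_2
```

Because the two masks are complementary, the separated spectrograms always add back up to the mixture, whatever the raw estimates are.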
We address the problem of separating drum sources from monaural mixtures of polyphonic music containing various pitched instruments as well as drums. We consider a spectrogram of music, described by a matrix where each row is associated with the intensities of a frequency over time. We employ a joint decomposition of several spectrogram matrices that include two…
We address the problem of separating drums from polyphonic music containing various pitched instruments as well as drums. Nonnegative matrix factorization (NMF) was successfully applied to spectrograms of music to learn basis vectors, followed by a support vector machine (SVM) to classify basis vectors into ones associated with drums (rhythmic source) only and…
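As a minimal sketch of the NMF step mentioned above, the code below factorizes a nonnegative spectrogram-like matrix `V` into basis vectors `W` and activations `H`. It assumes the standard multiplicative updates for the Euclidean cost, which may differ from the variant used in the paper. The function name `nmf` and its parameters are hypothetical, and the SVM classification of basis vectors is not shown.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize nonnegative V (freq x frames) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Euclidean error
    (an assumed choice; other divergences are also common).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # columns are basis vectors
    H = rng.random((rank, n_frames)) + eps  # rows are activations over time
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never flip.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Each column of `W` can then be treated as a feature vector for a downstream classifier that decides which bases belong to which source.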
Monaural source separation is important for many real-world applications. It is challenging because only single-channel information is available. In this paper, we explore using deep recurrent neural networks for singing voice separation from monaural recordings in a supervised setting. Deep recurrent neural networks with different temporal connections are…
Based on the assumption that there exists a neural network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight parameters, bias terms, input, and intermediate hidden layer output signals are all binary-valued, and require only basic…
We propose a novel extension of Nonnegative Matrix Factorization (NMF) that models a signal with multiple local dictionaries activated sparsely. This set of local dictionaries for a source, e.g., speech, disjointly constitutes a superset that is more discriminative than an ordinary NMF dictionary, because its local structures represent the source's manifold…
We present two complementary topic models to address the analysis of mixture data lying on manifolds. First, we propose a quantization method with an additional mid-layer latent variable, which selects only data points that best preserve the manifold structure of the input data. In order to address the case of modeling all the in-between parts of that…
In this paper we present a method for separating polyphonic music sources from a monaural mixture, where the underlying assumption is that the harmonic structure of a musical instrument remains roughly the same even if it is played at various pitches and is recorded in various mixing environments. We incorporate nonnegativity, shift-invariance, and…
Monaural source separation is important for many real-world applications. It is challenging because, with only a single channel of information available and no constraints, an infinite number of solutions are possible. In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation…
This paper presents a single-channel source separation method based on an extension of the Nonnegative Matrix Factorization (NMF) algorithm that smooths the original posterior probabilities with an additional Markov Random Field (MRF) structure. Our method is based on the alternative interpretation of NMF with β-divergence as latent variable models. By…