• Publications
Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation
TLDR
We explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including speech separation, singing voice separation, and speech denoising.
  • 304
  • 28
Deep learning for monaural speech separation
TLDR
We propose the joint optimization of the deep learning models (deep neural networks and recurrent neural networks) with an extra masking layer, which enforces a reconstruction constraint.
  • 319
  • 28
Singing-Voice Separation from Monaural Recordings using Deep Recurrent Neural Networks
TLDR
We explore the use of deep recurrent neural networks for singing voice separation from monaural recordings in a supervised setting.
  • 105
  • 13
Bitwise Neural Networks
TLDR
We propose a process for developing and deploying neural networks whose weight parameters, bias terms, input, and intermediate hidden layer output signals, are all binary-valued, and require only basic bit logic for the feedforward pass.
  • 155
  • 7
Mixtures of Local Dictionaries for Unsupervised Speech Enhancement
TLDR
We propose a novel extension of Nonnegative Matrix Factorization (NMF) that models a signal with multiple local dictionaries activated sparsely.
  • 42
  • 5
Nonnegative matrix partial co-factorization for drum source separation
TLDR
We present nonnegative matrix partial co-factorization (NMPCF) where the target matrix (spectrograms of music) and drum-only-matrix (collected from various drums) are simultaneously decomposed, sharing some factor matrix partially, to force some portion of basis vectors to be associated with drums only.
  • 57
  • 5
Experiments on deep learning for speech denoising
TLDR
We propose a very lightweight procedure that can predict clean speech spectra when presented with noisy speech inputs, and we show how various parameter choices impact the quality of the denoised signal.
  • 79
  • 4
XNOR-POP: A processing-in-memory architecture for binary Convolutional Neural Networks in Wide-IO2 DRAMs
TLDR
In this paper, we present a novel process-in-memory architecture to process emerging binary CNN tests in Wide-IO2 DRAMs.
  • 37
  • 4
Nonnegative Matrix Partial Co-Factorization for Spectral and Temporal Drum Source Separation
TLDR
We address a problem of separating drum sources from monaural mixtures of polyphonic music containing various pitched instruments as well as drums.
  • 51
  • 3
Collaborative Deep Learning for speech enhancement: A run-time model selection method using autoencoders
  • Minje Kim
  • Computer Science
  • IEEE International Conference on Acoustics…
  • 5 March 2017
TLDR
We show that a Modular Neural Network (MNN) can combine various speech enhancement modules that are specialized on dealing with a specific noise type, gender, and input Signal-to-Noise Ratio.
  • 13
  • 3