Online Non-Negative Convolutive Pattern Learning for Speech Signals

@article{Wang2013OnlineNC,
  title={Online Non-Negative Convolutive Pattern Learning for Speech Signals},
  author={Dong Wang and Ravichander Vipperla and Nicholas W. D. Evans and Thomas Fang Zheng},
  journal={IEEE Transactions on Signal Processing},
  year={2013},
  volume={61},
  pages={44-56}
}
The unsupervised learning of spectro-temporal patterns within speech signals is of interest in a broad range of applications. Where patterns are non-negative and convolutive in nature, relevant learning algorithms include convolutive non-negative matrix factorization (CNMF) and its sparse alternative, convolutive non-negative sparse coding (CNSC). Both algorithms, however, place unrealistic demands on computing power and memory, which prohibit their application in large-scale tasks. This paper…
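For orientation, the convolutive model that both CNMF and CNSC build on can be written as follows; this is the standard formulation from the literature, and the notation here is illustrative rather than the paper's own:

\[
V \;\approx\; \hat{V} \;=\; \sum_{t=0}^{T-1} W_t \,\overset{t\rightarrow}{H},
\qquad V, \hat{V} \in \mathbb{R}_{\geq 0}^{F \times N},\;\; W_t \in \mathbb{R}_{\geq 0}^{F \times K},\;\; H \in \mathbb{R}_{\geq 0}^{K \times N},
\]

where the K patterns span T frames each (frame t of pattern k is column k of W_t), H holds the pattern activations, and \(\overset{t\rightarrow}{H}\) denotes H shifted t columns to the right with zero fill. CNMF minimizes a reconstruction cost such as \(\|V - \hat{V}\|_F^2\) under non-negativity constraints; CNSC adds a sparsity penalty \(\lambda \|H\|_1\) on the activations.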
Supervised speech enhancement using online Group-Sparse Convolutive NMF
TLDR
Results show that the proposed online Group-Sparse Convolutive NMF algorithm can significantly increase the PESQ of the enhanced clean speech.
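For context, a typical way to impose group sparsity in such a model is to add a mixed l2,1 penalty to the convolutive reconstruction cost; the exact grouping and weighting in the cited paper may differ, so this is only a generic form:

\[
C(W, H) \;=\; D\!\left(V \,\Big\|\, \sum_{t=0}^{T-1} W_t \overset{t\rightarrow}{H}\right) \;+\; \lambda \sum_{g} \big\| H_{g,:} \big\|_2,
\]

where each group g collects the activation rows associated with one source (e.g., one speaker or one noise type), so that entire groups tend to be switched off together.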
Online learning of time-frequency patterns
TLDR
An online method is presented for learning recurring time-frequency patterns from spectrograms using first-order stochastic gradient descent with a monotonically decreasing learning rate, which makes it suitable for handling large amounts of data.
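A minimal sketch of that idea, assuming first-order projected SGD on a convolutive dictionary with a monotonically decreasing step size; the function and parameter names (learn_patterns, eta0, decay) are illustrative, and the activations H are simply randomized here for brevity, whereas a real method would infer them per chunk:

# Sketch only: online first-order SGD for convolutive time-frequency patterns.
import numpy as np

def learn_patterns(spectrogram_stream, F=257, K=20, T=8, eta0=0.1, decay=1e-3):
    rng = np.random.default_rng(0)
    W = np.abs(rng.standard_normal((T, F, K)))        # T frames of K patterns
    for step, V in enumerate(spectrogram_stream):     # V: non-negative F x N chunk
        N = V.shape[1]
        H = np.abs(rng.standard_normal((K, N)))       # placeholder activations
        # Reconstruction: sum_t W[t] @ (H shifted right by t columns, zero-filled)
        V_hat = sum(W[t] @ np.pad(H, ((0, 0), (t, 0)))[:, :N] for t in range(T))
        err = V_hat - V
        eta = eta0 / (1.0 + decay * step)             # monotonically decreasing rate
        for t in range(T):
            H_shift = np.pad(H, ((0, 0), (t, 0)))[:, :N]
            W[t] = np.maximum(W[t] - eta * err @ H_shift.T, 0.0)  # projected SGD step
    return W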
Robust Non‐negative Matrix Factorization with β‐Divergence for Speech Separation
TLDR
Experimental speech separation results show that the proposed convolutional RNMF successfully separates the repeating time‐varying spectral structures from the magnitude spectrum of the mixture, and does so without any prior training.
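For reference, the β-divergence family named in the title above is, in its standard element-wise form,

\[
d_\beta(x \mid y) \;=\;
\begin{cases}
\dfrac{1}{\beta(\beta-1)}\left(x^{\beta} + (\beta-1)\,y^{\beta} - \beta\, x\, y^{\beta-1}\right), & \beta \in \mathbb{R} \setminus \{0, 1\},\\[1ex]
x \log \dfrac{x}{y} - x + y, & \beta = 1 \ \text{(Kullback-Leibler)},\\[1ex]
\dfrac{x}{y} - \log \dfrac{x}{y} - 1, & \beta = 0 \ \text{(Itakura-Saito)},
\end{cases}
\]

with β = 2 recovering half the squared Euclidean distance; the divergence is summed over all time-frequency bins of the spectrogram.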
Online Learning of Time-Frequency Patterns
TLDR
An online method to learn recurring time-frequency patterns from spectrograms relies on a convolutive decomposition that factors sequences of spectra into time-frequency patterns and their corresponding activation signals, and is suitable for handling large amounts of data.
Robust Hierarchical Learning for Non-Negative Matrix Factorization With Outliers
TLDR
A novel approach is proposed that provides robustness in the presence of noise and outliers, ease of implementation, and guaranteed convergence, by extending the automatic relevance determination framework for NMF of Tan and Févotte and developing majorization–minimization algorithms.
Deep and Sparse Learning in Speech and Language Processing: An Overview
TLDR
An overview of the growing interest in a unified Sparse Deep or Deep Sparse learning framework is provided, and future research possibilities in this multi-disciplinary area are outlined.
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience
TLDR
This work proposes a tool that extends a convolutional NMF technique to prevent its common failure modes, provides a framework for extracting sequences from a dataset, and is easily cross-validated to assess the significance of each extracted factor.
Speech Enhancement Under Low SNR Conditions Via Noise Estimation Using Sparse and Low-Rank NMF with Kullback–Leibler Divergence
TLDR
Without any prior knowledge of speech and noise, sparse and low-rank nonnegative matrix factorization (NMF) with Kullback-Leibler divergence is proposed for noise and speech estimation by decomposing the input noisy magnitude spectrogram into a low-rank noise part and a sparse speech-like part.
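A generic sparse-plus-low-rank NMF objective of the kind described, written here in illustrative notation (the cited paper's exact regularizers and weights may differ), is

\[
\min_{W, H, S \,\geq\, 0} \; D_{\mathrm{KL}}\!\left(V \,\|\, W H + S\right) + \lambda \|S\|_1,
\qquad
D_{\mathrm{KL}}(V \,\|\, \hat{V}) = \sum_{f,n} \Big( V_{fn} \log \tfrac{V_{fn}}{\hat{V}_{fn}} - V_{fn} + \hat{V}_{fn} \Big),
\]

where the low-rank product WH (with few columns in W) models the noise and the sparse term S captures the speech-like components of the noisy magnitude spectrogram V.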
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience
TLDR
By identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
Can We Trust Deep Speech Prior?
TLDR
A comprehensive study demonstrates that a reasonable speech enhancement (SE) performance can be achieved based on deep speech priors, but the results might be suboptimal.

References

Showing 1-10 of 76 references
Convolutive non-negative sparse coding
  • Wenwu Wang
  • Computer Science
    2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
  • 2008
TLDR
This paper develops an effective learning algorithm based on multiplicative updates for a reconstruction error defined by the squared Euclidean distance, and applies it to the separation of music audio objects in the magnitude spectrum domain.
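A minimal sketch of multiplicative updates of that kind, shown for the plain (non-convolutive) sparse NMF case under the squared Euclidean cost to keep it short; eps and the sparsity weight lam are illustrative:

# Sketch only: multiplicative updates for sparse NMF with squared Euclidean cost.
import numpy as np

def sparse_nmf(V, K=20, lam=0.1, n_iter=200, eps=1e-12):
    rng = np.random.default_rng(0)
    F, N = V.shape
    W = np.abs(rng.standard_normal((F, K)))
    H = np.abs(rng.standard_normal((K, N)))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)   # sparsity enters the denominator
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H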
Robust speech recognition in multi-source noise environments using convolutive non-negative matrix factorization
TLDR
It is shown that background noise can be effectively attenuated from noisy speech by learning the noise bases from several hours of ambient noise data and over a few seconds of local acoustic context.
Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
  • T. Virtanen
  • Computer Science
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2007
TLDR
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented, and it achieves better separation quality than previous algorithms.
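Conceptually, the temporal continuity and sparseness criteria amount to adding penalties of the following general form to the reconstruction cost; this is a simplified, unnormalized version, and the cited paper scales and normalizes the terms differently:

\[
C(W, H) \;=\; D(V \,\|\, W H) \;+\; \alpha \sum_{k} \sum_{n} \big( h_{k,n} - h_{k,n-1} \big)^2 \;+\; \beta \sum_{k} \sum_{n} \big| h_{k,n} \big|,
\]

where the squared-difference term favors slowly varying gains and the absolute-value term favors sparse activations.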
Online Learning for Matrix Factorization and Sparse Coding
TLDR
A new online optimization algorithm is proposed, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems.
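A rough sketch in the spirit of that scheme: each incoming sample is sparse-coded against the current dictionary, sufficient statistics are accumulated, and the dictionary is refreshed by block-coordinate descent. scikit-learn's Lasso is used for the coding step; all names and constants are illustrative:

# Sketch only: online dictionary learning with accumulated sufficient statistics.
import numpy as np
from sklearn.linear_model import Lasso

def online_dictionary_learning(samples, n_atoms=64, lam=0.1, n_inner=5):
    d = samples[0].shape[0]
    rng = np.random.default_rng(0)
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
    A = np.zeros((n_atoms, n_atoms))               # running sum of alpha alpha^T
    B = np.zeros((d, n_atoms))                     # running sum of x alpha^T
    coder = Lasso(alpha=lam, fit_intercept=False, max_iter=1000)
    for x in samples:                              # x: 1-D sample of length d
        alpha = coder.fit(D, x).coef_              # sparse code of the new sample
        A += np.outer(alpha, alpha)
        B += np.outer(x, alpha)
        for _ in range(n_inner):                   # block-coordinate dictionary update
            for j in range(n_atoms):
                if A[j, j] < 1e-12:
                    continue
                u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
                D[:, j] = u / max(np.linalg.norm(u), 1.0)
    return D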
Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence
TLDR
This work provides an online algorithm for nonnegative matrix factorization applied to audio source separation, with dictionary updates of O(FK) complexity in both time and memory.
Speech enhancement with sparse coding in learned dictionaries
TLDR
This work presents a monaural speech enhancement method based on sparse coding of noisy speech signals in a composite dictionary formed by concatenating a speech dictionary and an interferer dictionary, both of which may be over-complete.
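The composite-dictionary idea can be summarized as follows (notation illustrative):

\[
\mathbf{x} \;\approx\; \big[\, D_s \;\; D_i \,\big]
\begin{bmatrix} \boldsymbol{\alpha}_s \\ \boldsymbol{\alpha}_i \end{bmatrix},
\qquad \hat{\mathbf{x}}_{\text{speech}} = D_s \boldsymbol{\alpha}_s,
\]

where the noisy frame x is sparsely coded over the concatenation of a speech dictionary D_s and an interferer dictionary D_i, and the enhanced speech is re-synthesized from the speech part of the code alone.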
Convolutive Non-Negative Matrix Factorisation with a Sparseness Constraint
TLDR
This work presents an extension to NMF that is convolutive and includes a sparseness constraint, and in combination with a spectral magnitude transform, this method discovers auditory objects and their associated sparse activation patterns.
Non-negative Matrix Factorization with Quasi-Newton Optimization
TLDR
This work derives a relatively simple second-order quasi-Newton method for NMF with the so-called Amari alpha divergence, which has been extensively tested on blind source separation problems for both signals and images.
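For reference, one common parametrization of the Amari alpha-divergence used as an NMF cost is

\[
D_\alpha(x \,\|\, y) \;=\; \frac{1}{\alpha(\alpha-1)} \sum_{i} \Big( x_i^{\alpha}\, y_i^{1-\alpha} - \alpha\, x_i + (\alpha-1)\, y_i \Big),
\]

which recovers the Kullback-Leibler divergence D(x||y) as α → 1 and the reverse divergence D(y||x) as α → 0; this is the standard textbook form and not necessarily the exact variant used in the cited work.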
Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription
TLDR
Bayesian NMF with harmonicity and temporal continuity constraints is shown to outperform other standard NMF-based transcription systems, providing a meaningful mid-level representation of the data.