Online Non-Negative Convolutive Pattern Learning for Speech Signals
@article{Wang2013OnlineNC, title={Online Non-Negative Convolutive Pattern Learning for Speech Signals}, author={Dong Wang and Ravichander Vipperla and Nicholas W. D. Evans and Thomas Fang Zheng}, journal={IEEE Transactions on Signal Processing}, year={2013}, volume={61}, pages={44-56} }
The unsupervised learning of spectro-temporal patterns within speech signals is of interest in a broad range of applications. Where patterns are non-negative and convolutive in nature, relevant learning algorithms include convolutive non-negative matrix factorization (CNMF) and its sparse alternative, convolutive non-negative sparse coding (CNSC). Both algorithms, however, place unrealistic demands on computing power and memory which prohibit their application in large scale tasks. This paper…
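For readers unfamiliar with the convolutive model that CNMF and CNSC share, the following is a minimal sketch of how it is commonly written: a non-negative spectrogram is approximated as V_hat[f][t] = sum over lags p and patterns k of W_p[f][k] * H[k][t - p], so each pattern spans several consecutive frames. This illustrates only the reconstruction step, not the online learning algorithm proposed in the paper. The names `cnmf_reconstruct`, `bases` and `activations` are hypothetical and chosen for this example.

```python
from typing import List

Matrix = List[List[float]]


def cnmf_reconstruct(bases: List[Matrix], activations: Matrix) -> Matrix:
    """Reconstruct V_hat[f][t] = sum_p sum_k W_p[f][k] * H[k][t - p].

    bases[p] is the F x K slice of the patterns at time lag p (hypothetical layout);
    activations is the K x T matrix of non-negative pattern activations.
    """
    n_lags = len(bases)
    n_freq = len(bases[0])
    n_patterns = len(activations)
    n_frames = len(activations[0])
    v_hat = [[0.0] * n_frames for _ in range(n_freq)]
    for p in range(n_lags):
        for f in range(n_freq):
            for k in range(n_patterns):
                w = bases[p][f][k]
                # Shifting H right by p frames: frames before t = p get no contribution.
                for t in range(p, n_frames):
                    v_hat[f][t] += w * activations[k][t - p]
    return v_hat


# One pattern spanning two frames: energy in bin 0 first, then in bin 1.
bases = [
    [[1.0], [0.0]],  # lag 0
    [[0.0], [2.0]],  # lag 1
]
activations = [[1.0, 0.0, 3.0, 0.0]]  # the pattern fires at frames 0 and 2
v_hat = cnmf_reconstruct(bases, activations)
```

Each activation in `activations` stamps the whole two-frame pattern into the output, which is what distinguishes the convolutive model from standard NMF, where every basis covers a single frame.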
25 Citations
Supervised speech enhancement using online Group-Sparse Convolutive NMF
- Computer Science
- 2016 8th International Symposium on Telecommunications (IST)
- 2016
The results show that the proposed online Group-Sparse Convolutive NMF algorithm significantly increases the PESQ score of the enhanced speech.
Online learning of time-frequency patterns
- Computer Science
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2017
An online method is presented for learning recurring time-frequency patterns from spectrograms; it uses first-order stochastic gradient descent with a monotonically decreasing learning rate and is suited to handling large amounts of data.
Robust Non‐negative Matrix Factorization with β‐Divergence for Speech Separation
- Computer Science
- 2017
Experimental speech separation results show that the proposed convolutional RNMF successfully separates the repeating time‐varying spectral structures from the magnitude spectrum of the mixture, and does so without any prior training.
LEARNING OF TIME-FREQUENCY PATTERNS
- Computer Science
- 2017
An online method to learn recurring time-frequency patterns from spectrograms relies on a convolutive decomposition that estimates sequences of spectra as time-frequency patterns and their corresponding activation signals, and is suitable for handling large amounts of data.
Robust Hierarchical Learning for Non-Negative Matrix Factorization With Outliers
- Computer Science
- IEEE Access
- 2019
A novel approach is proposed that extends the automatic relevance determination framework for NMF of Tan and Févotte with majorization–minimization algorithms, providing robustness to noise and outliers, ease of implementation, and guaranteed convergence.
Deep and Sparse Learning in Speech and Language Processing: An Overview
- Computer Science
- BICS
- 2016
An overview of growing interest in a unified Sparse Deep or Deep Sparse learning framework is provided, and future research possibilities in this multi-disciplinary area are outlined.
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience
- Biology, Computer Science
- 2018
This work proposes a tool that extends a convolutional NMF technique to prevent its common failure modes, and provides a framework for extracting sequences from a dataset, and is easily cross-validated to assess the significance of each extracted factor.
Speech Enhancement Under Low SNR Conditions Via Noise Estimation Using Sparse and Low-Rank NMF with Kullback–Leibler Divergence
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2015
Without any prior knowledge of speech and noise, sparse and low-rank nonnegative matrix factorization (NMF) with Kullback-Leibler divergence is proposed for noise and speech estimation, decomposing the input noisy magnitude spectrogram into a low-rank noise part and a sparse speech-like part.
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience
- Biology, Computer Science
- eLife
- 2019
By identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
Can We Trust Deep Speech Prior?
- Computer Science
- 2021 IEEE Spoken Language Technology Workshop (SLT)
- 2021
A comprehensive study demonstrated that based on deep speech priors, a reasonable SE performance can be achieved, but the results might be suboptimal.
References
Convolutive non-negative sparse coding
- Computer Science
- 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
- 2008
This paper has developed an effective learning algorithm based on the multiplicative adaptation of the reconstruction error function defined by the squared Euclidean distance that is applied to the separation of music audio objects in the magnitude spectrum domain.
Robust speech recognition in multi-source noise environments using convolutive non-negative matrix factorization
- Computer Science, Physics
- 2011
It is shown that background noise can be effectively attenuated from noisy speech by learning the noise bases from several hours of ambient noise data and over a few seconds of local acoustic context.
Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2007
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented and enables a better separation quality than the previous algorithms.
Online Learning for Matrix Factorization and Sparse Coding
- Computer Science
- J. Mach. Learn. Res.
- 2010
A new online optimization algorithm is proposed, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems.
Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence
- Computer Science
- 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2011
This work provides an online algorithm with a complexity of O(FK) in time and memory for updates in the dictionary for nonnegative matrix factorization for audio source separation.
Speech enhancement with sparse coding in learned dictionaries
- Computer Science
- 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2010
This work presents a monaural speech enhancement method based on sparse coding of noisy speech signals in a composite dictionary, consisting of the concatenation of a speech and interferer dictionary, both being possibly over-complete.
Convolutive Non-Negative Matrix Factorisation with a Sparseness Constraint
- Computer Science
- 2006
This work presents an extension to NMF that is convolutive and includes a sparseness constraint, and in combination with a spectral magnitude transform, this method discovers auditory objects and their associated sparse activation patterns.
Convolutive Non-Negative Matrix Factorisation with a Sparseness Constraint
- Computer Science
- 2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing
- 2006
Non-negative Matrix Factorization with Quasi-Newton Optimization
- Computer Science
- ICAISC
- 2006
This work derives a relatively simple second-order quasi-Newton method for NMF that uses the so-called Amari alpha divergence as its cost function; the method has been extensively tested on blind source separation problems, for both signals and images.
Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2010
Bayesian NMF with harmonicity and temporal continuity constraints is shown to outperform other standard NMF-based transcription systems, providing a meaningful mid-level representation of the data.