Corpus ID: 56895372

Tensor-Train Long Short-Term Memory for Monaural Speech Enhancement

@article{Samui2018TensorTrainLS,
  title={Tensor-Train Long Short-Term Memory for Monaural Speech Enhancement},
  author={Suman Samui and Indrajit Chakrabarti and Soumya K. Ghosh},
  journal={ArXiv},
  year={2018},
  volume={abs/1812.10095}
}
In recent years, Long Short-Term Memory (LSTM) networks have become a popular choice for speech separation and speech enhancement tasks. The capability of an LSTM network can be enhanced by widening it and adding more layers. However, this introduces millions of parameters into the network and increases the demand for computational resources. These limitations hinder the efficient implementation of RNN models on low-end devices such as mobile phones and embedded systems with limited memory. To… 
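
The parameter savings the abstract refers to come from representing each large LSTM weight matrix in tensor-train (TT) format. The following is a minimal NumPy sketch of that idea, not the paper's configuration: the mode sizes, TT-ranks, and random cores are illustrative assumptions, chosen only to show how a 1024 x 4096 dense matrix (about 4.2M parameters) can be represented with roughly 2.3K TT parameters.

    import numpy as np

    # Illustrative factorization of one large weight matrix; the mode sizes and
    # TT-ranks below are assumptions for this sketch, not the paper's settings.
    in_modes, out_modes = [4, 8, 8, 4], [8, 8, 8, 8]   # 1024 inputs -> 4096 outputs
    ranks = [1, 4, 4, 4, 1]                            # TT-ranks (boundary ranks are 1)

    # Core k has shape (r_{k-1}, in_modes[k], out_modes[k], r_k).
    rng = np.random.default_rng(0)
    cores = [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
             for k in range(len(in_modes))]

    def tt_to_dense(cores):
        """Contract the TT cores back into the full (prod(in), prod(out)) matrix."""
        full = np.ones((1, 1, 1))                      # (rows so far, cols so far, right rank)
        for G in cores:
            full = np.einsum('MNa,amnb->MmNnb', full, G)
            M, m, N, n, b = full.shape
            full = full.reshape(M * m, N * n, b)
        return full[..., 0]

    W = tt_to_dense(cores)
    print(W.shape)                         # (1024, 4096)
    print(W.size)                          # 4,194,304 dense parameters
    print(sum(G.size for G in cores))      # 2,304 TT parameters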

Citations

Shared Network for Speech Enhancement Based on Multi-Task Learning
TLDR
This work proposes a two-stage method called ShareNet that first trains a convolutional neural network to perform noise reduction, and then stacks these two pretrained blocks while keeping the parameters shared to perform both denoising and repairing tasks.
Application of Tensor Train Decomposition in S2VT Model for Sign Language Recognition
TLDR
The experimental results demonstrated that when the fully-connected layer and the first LSTM layer in S2VT were represented in tensor-train format, the model obtained the best performance, maintaining high accuracy while significantly reducing parameters and memory.
Compressing Deep Neural Networks for Efficient Speech Enhancement
  • Ke Tan, Deliang Wang
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
TLDR
It is found that training and compressing a large DNN yields higher STOI and PESQ than directly training a small DNN that has a comparable size to the compressed DNN, which further suggests the benefits of using the proposed model compression approach.
Prediction of the Number of Mathematics Questions in the University Entrance Exam by Topics with an LSTM-Based Deep Neural Network
TLDR
An LSTM-based deep neural network is proposed for estimating the number of mathematics questions by topic in the university entrance exam, and the model's high accuracy is expected to support the development of predictive systems in future studies.
Classification of BMI States from Foot Base Pressure with LSTM-based Deep Neural Network
TLDR
The proposed classification model is verified by classifying foot base pressure data according to obesity status determined by an individual's BMI, and is observed to provide high accuracy in classification.

References

Showing 1-10 of 22 references
Long short-term memory for speaker generalization in supervised speech separation.
TLDR
A separation model based on long short-term memory (LSTM) is proposed, which naturally accounts for temporal dynamics of speech and which substantially outperforms a DNN-based model on unseen speakers and unseen noises in terms of objective speech intelligibility.
Deep Recurrent Neural Network Based Monaural Speech Separation Using Recurrent Temporal Restricted Boltzmann Machines
TLDR
Experimental results have established that the proposed approach outperforms NMF and conventional DNN and DRNN-based speech enhancement methods.
Speech recognition with deep recurrent neural networks
TLDR
This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Tensorizing Neural Networks
TLDR
This paper converts the dense weight matrices of the fully-connected layers to the Tensor Train format such that the number of parameters is reduced by a huge factor and at the same time the expressive power of the layer is preserved.
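
Beyond the parameter count, a TT layer never needs to materialize the dense matrix: the matrix-by-vector product can be carried out core by core. The snippet below is a sketch of that idea under assumed toy shapes and a TT-rank of 3, not a reference implementation from this paper; the dense matrix is reconstructed only because the small sizes make a correctness check feasible.

    import numpy as np

    # Two illustrative TT cores for a 64 x 64 matrix viewed as (8*8) x (8*8).
    rng = np.random.default_rng(1)
    G1 = rng.standard_normal((1, 8, 8, 3))     # (r0, m1, n1, r1)
    G2 = rng.standard_normal((3, 8, 8, 1))     # (r1, m2, n2, r2)
    cores = [G1, G2]

    def tt_matvec(cores, x):
        """Compute y = x @ W directly from the TT cores, never forming the dense W."""
        t = x.reshape(1, 1, -1)                        # (outputs so far, left rank, remaining inputs)
        for G in cores:                                # G: (r_prev, m, n, r_next)
            r_prev, m, n, r_next = G.shape
            t = t.reshape(t.shape[0], r_prev, m, -1)   # expose the current input mode
            t = np.einsum('oamr,amnb->onbr', t, G)     # contract rank and input mode
            t = t.reshape(-1, r_next, t.shape[-1])     # fold the new output mode in
        return t.reshape(-1)

    # Check against the explicitly reconstructed dense matrix.
    W = np.einsum('aijb,bklc->ikjl', G1, G2).reshape(64, 64)
    x = rng.standard_normal(64)
    print(np.allclose(tt_matvec(cores, x), x @ W))     # True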
LSTM: A Search Space Odyssey
TLDR
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling, and observes that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
On Training Targets for Supervised Speech Separation
TLDR
Results in various test conditions reveal that the two ratio mask targets, the IRM and the FFT-MASK, outperform the other targets in terms of objective intelligibility and quality metrics, and that masking-based targets are, in general, significantly better than spectral-envelope-based targets.
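
For reference, the ideal ratio mask (IRM) named in this summary is commonly defined per time-frequency unit as IRM(t, f) = (S^2 / (S^2 + N^2))^beta with beta typically set to 0.5, where S and N are clean-speech and noise magnitudes. The snippet below is a minimal NumPy sketch under that common definition; the toy spectrograms and the use of S + N as a stand-in for the noisy magnitude are illustrative assumptions.

    import numpy as np

    def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5):
        """IRM(t, f) = (S^2 / (S^2 + N^2))^beta, computed per time-frequency unit."""
        s2, n2 = speech_mag ** 2, noise_mag ** 2
        return (s2 / (s2 + n2 + 1e-12)) ** beta        # epsilon guards against empty bins

    # Toy magnitude spectrograms (frames x frequency bins); real ones come from an STFT.
    rng = np.random.default_rng(0)
    S = np.abs(rng.standard_normal((100, 257)))
    N = np.abs(rng.standard_normal((100, 257)))

    mask = ideal_ratio_mask(S, N)
    enhanced = mask * (S + N)                          # S + N stands in for the noisy magnitude
    print(mask.min() >= 0.0, mask.max() <= 1.0)        # True True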
Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems
TLDR
It is shown that DNN-based SE systems, when trained specifically to handle certain speakers, noise types and SNRs, are capable of achieving large improvements in estimated speech quality (SQ) and speech intelligibility (SI), when tested in matched conditions.
Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
TLDR
A new VAD algorithm based on boosted deep neural networks (bDNNs) is described that outperforms state-of-the-art VADs by a considerable margin and employs a new acoustic feature, the multi-resolution cochleagram (MRCG), which concatenates cochleagram features at multiple spectrotemporal resolutions and shows superior speech separation results over many acoustic features.
Long Short-Term Memory
TLDR
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
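
For context, a single step of the LSTM cell can be written in a few lines. The sketch below assumes the standard modern formulation with a forget gate (added in later work), with W, U, and b standing for the input, recurrent, and bias parameters; these are the matrices a tensor-train LSTM would factorize. The dimensions are purely illustrative.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def lstm_step(x, h_prev, c_prev, W, U, b):
        """One LSTM step; W: (4*hidden, in_dim), U: (4*hidden, hidden), b: (4*hidden,)."""
        hidden = h_prev.shape[0]
        z = W @ x + U @ h_prev + b
        i = sigmoid(z[:hidden])                        # input gate
        f = sigmoid(z[hidden:2 * hidden])              # forget gate
        g = np.tanh(z[2 * hidden:3 * hidden])          # candidate cell update
        o = sigmoid(z[3 * hidden:])                    # output gate
        c = f * c_prev + i * g                         # cell state: the constant error carousel
        h = o * np.tanh(c)
        return h, c

    rng = np.random.default_rng(0)
    in_dim, hidden = 161, 256                          # illustrative feature and state sizes
    W = 0.01 * rng.standard_normal((4 * hidden, in_dim))
    U = 0.01 * rng.standard_normal((4 * hidden, hidden))
    b = np.zeros(4 * hidden)
    h, c = np.zeros(hidden), np.zeros(hidden)
    h, c = lstm_step(rng.standard_normal(in_dim), h, c, W, U, b)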
A Critical Review of Recurrent Neural Networks for Sequence Learning
TLDR
The goal of this survey is to provide a self-contained explication of the state of the art of recurrent neural networks together with a historical perspective and references to primary research.