A Fully Convolutional Neural Network for Speech Enhancement
@inproceedings{Park2017AFC,
  title={A Fully Convolutional Neural Network for Speech Enhancement},
  author={Se Rim Park and Jinwon Lee},
  booktitle={INTERSPEECH},
  year={2017}
}
In hearing aids, the presence of babble noise greatly degrades the intelligibility of human speech. However, removing the babble without creating artifacts in human speech is a challenging task in a low SNR environment. Here, we sought to solve the problem by finding a 'mapping' between noisy speech spectra and clean speech spectra via supervised learning. Specifically, we propose using fully Convolutional Neural Networks, which consist of a smaller number of parameters than fully connected…
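As a rough illustration of the parameter-count argument in the abstract, the sketch below compares one fully connected layer with a small stack of 1-D convolutional layers that map a window of noisy spectral frames to one clean frame. All sizes (`N_BINS`, `N_FRAMES`, `KERNEL`, `CHANNELS`) are hypothetical values chosen for illustration; they are not taken from the paper, and the layer layout is an assumption rather than the authors' architecture.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights plus biases of one 1-D convolutional layer.

    The same filters are reused at every frequency position, so the count
    does not depend on the number of frequency bins.
    """
    return kernel_size * in_channels * out_channels + out_channels


# Hypothetical sizes (assumptions, not values from the paper).
N_BINS = 129     # frequency bins per spectral frame
N_FRAMES = 8     # consecutive noisy frames used as input context
KERNEL = 9       # filter length along the frequency axis
CHANNELS = 16    # hidden feature maps

# Fully connected: every bin of every input frame connects to every output bin.
fc_total = dense_params(N_BINS * N_FRAMES, N_BINS)

# Convolutional: input frames act as channels; filters slide along frequency.
conv_total = (
    conv1d_params(KERNEL, N_FRAMES, CHANNELS)   # input frames -> hidden maps
    + conv1d_params(KERNEL, CHANNELS, 1)        # hidden maps -> one clean frame
)

print(f"fully connected: {fc_total} parameters")
print(f"convolutional:   {conv_total} parameters")
```

Because the convolutional filters are shared across frequency, the convolutional count stays fixed as `N_BINS` grows, whereas the fully connected count grows quadratically with it.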
218 Citations
A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement
- Computer Science
- INTERSPEECH
- 2018
This paper incorporates a convolutional encoder-decoder (CED) and long short-term memory (LSTM) into the CRN architecture, which leads to a causal system that is naturally suitable for real-time processing.
A Fully Convolutional Neural Network for Complex Spectrogram Processing in Speech Enhancement
- Computer Science
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
The proposed CNN consists of one-dimensional (1-d) convolution and frequency-dilated 2-d convolution, incorporates residual learning and a skip-connection structure, and achieves better performance with fewer parameters.
Speech Enhancement using Convolutional Neural Network with Skip Connections
- Computer Science
- 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)
- 2018
Experimental results demonstrate that the proposed CNN structure provides better denoising ability than Wiener filtering, even when the model is tested using data and noise sets not included in the training set.
Regression-based speech enhancement by convolutional neural network
- Physics
- 2018 26th Signal Processing and Communications Applications Conference (SIU)
- 2018
A regression-based convolutional neural network model is proposed for speech enhancement to remove noise from conversations, and the results are evaluated by perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI).
Separated Noise Suppression and Speech Restoration: Lstm-Based Speech Enhancement in Two Stages
- Computer Science
- 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2019
This work addresses the problem that speech distortions can be introduced when employing NNs trained to provide strong noise suppression, by first suppressing noise and subsequently restoring speech, with specifically chosen NN topologies for each of these distinct tasks.
Speech Denoising with Auditory Models
- Computer Science
- Interspeech
- 2021
The results show that deep features can guide speech enhancement, but suggest that they do not yet outperform simple alternatives that do not involve learned features.
Redundant Convolutional Network With Attention Mechanism For Monaural Speech Enhancement
- Computer Science
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
This study introduces an attention mechanism into the convolutional encoder-decoder model that adaptively filters channel-wise feature responses by explicitly modeling attentions (on speech versus noise signals) between channels.
Gated Residual Networks with Dilated Convolutions for Supervised Speech Separation
- Computer Science
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
This work proposes a novel convolutional neural network (CNN) to deal with noise- and speaker-independent speech separation and finds that the proposed model consistently outperforms a state-of-the-art long short-term memory (LSTM) based model in terms of objective speech intelligibility and quality.
Speech Enhancement via Deep Spectrum Image Translation Network
- Computer Science
- 2019 26th National and 4th International Iranian Conference on Biomedical Engineering (ICBME)
- 2019
A novel speech enhancement approach is suggested that uses a deep spectrum image translation network, in which a deep fully convolutional network known as VGG19 is embedded at the encoder part of an image-to-image translation network, i.e., U-Net.
Speech Enhancement by Multiple Propagation through the Same Neural Network
- Computer Science
- Sensors
- 2022
Previous efforts are extended to demonstrate how multi-forward-pass speech enhancement can be successfully applied to other architectures, namely the ResBLSTM and Transformer-Net. The results show that performing speech enhancement up to five times still brings improvements to speech intelligibility, but the gain becomes smaller with each iteration.
References
Learning spectral mapping for speech dereverberation
- Physics, Computer Science
- 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2014
It is demonstrated that distortion caused by reverberation is substantially attenuated by the DNN, whose outputs can be resynthesized to the dereverberated speech signal.
A Regression Approach to Speech Enhancement Based on Deep Neural Networks
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2015
The proposed DNN approach can well suppress highly nonstationary noise, which is tough to handle in general, and is effective in dealing with noisy speech data recorded in real-world scenarios without the generation of the annoying musical artifact commonly observed in conventional enhancement methods.
Convolutional Neural Networks for Speech Recognition
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2014
It is shown that further error rate reduction can be obtained by using convolutional neural networks (CNNs), and a limited-weight-sharing scheme is proposed that can better model speech features.
Complex recurrent neural networks for denoising speech signals
- Computer Science, Engineering
- 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2015
Noise reduction experiments on noisy speech, both with digitally added synthetic noise and real car noise, show that the proposed algorithm can recover much of the degradation caused by the noise.
Speech enhancement with weighted denoising auto-encoder
- Computer Science
- INTERSPEECH
- 2013
A novel speech enhancement method with Weighted Denoising Auto-encoder (WDA) is proposed, which can achieve a similar amount of noise reduction in both white and colored noise, and the distortion on the level of the speech signal is smaller.
Enhancement and bandwidth compression of noisy speech
- Computer Science
- Proceedings of the IEEE
- 1979
An overview of the variety of techniques that have been proposed for enhancement and bandwidth compression of speech degraded by additive background noise is provided, to suggest a unifying framework in terms of which the relationships between these systems are more visible and which hopefully provides a structure that will suggest fruitful directions for further research.
Suppression of acoustic noise in speech using spectral subtraction
- Physics
- 1979
A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Babble Noise: Modeling, Analysis, and Applications
- Physics
- IEEE Transactions on Audio, Speech, and Language Processing
- 2009
This study represents effectively the first effort in developing an overall model for speech babble, and with this, contributions are made for speech system robustness in noise.
A short-time objective intelligibility measure for time-frequency weighted noisy speech
- Physics
- 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2010
An objective intelligibility measure is presented, which shows high correlation (rho=0.95) with the intelligibility of both noisy, and TF-weighted noisy speech, and shows significantly better performance than three other, more sophisticated, objective measures.
A signal subspace approach for speech enhancement
- Computer Science
- IEEE Trans. Speech Audio Process.
- 1995
The popular spectral subtraction speech enhancement approach is shown to be a signal subspace approach which is optimal in an asymptotic (large sample) linear minimum mean square error sense, assuming the signal and noise are stationary.