Mixture of Inference Networks for VAE-Based Audio-Visual Speech Enhancement

@article{Sadeghi2021MixtureOI,
  title={Mixture of Inference Networks for VAE-Based Audio-Visual Speech Enhancement},
  author={M. Sadeghi and Xavier Alameda-Pineda},
  journal={IEEE Transactions on Signal Processing},
  year={2021},
  volume={69},
  pages={1899--1909}
}
We address unsupervised audio-visual speech enhancement based on variational autoencoders (VAEs), where the prior distribution of the clean speech spectrogram is simulated using an encoder-decoder architecture. At enhancement (test) time, the trained generative model (decoder) is combined with a noise model whose parameters need to be estimated. The initialization of the latent variables describing the generative process of the clean speech via the decoder is crucial, as the overall inference…
3 Citations
Deep Variational Generative Models for Audio-visual Speech Separation
Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
