Corpus ID: 235755165

Adversarial Auto-Encoding for Packet Loss Concealment

Santiago Pascual, Joan Serrà, Jordi Pons
Communication technologies like voice over IP operate under constrained real-time conditions, with voice packets being subject to delays and losses from the network. In such cases, the packet loss concealment (PLC) algorithm reconstructs missing frames until a new real packet is received. Recently, autoregressive deep neural networks have been shown to surpass the quality of signal processing methods for PLC, especially for long-term predictions beyond 60 ms. In this work, we propose a non…
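To make the PLC setting concrete, the following is a minimal sketch of a classical repeat-and-attenuate baseline. It is not the method proposed in the paper: it only illustrates how missing frames are filled in until a real packet arrives. The function name `conceal`, the `decay` parameter, and the use of `None` to mark a lost frame are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence

Frame = List[float]


def conceal(frames: Sequence[Optional[Frame]],
            frame_len: int,
            decay: float = 0.5) -> List[Frame]:
    """Fill lost frames (marked as None) with an attenuated copy of the
    previous output frame. Hypothetical baseline, for illustration only."""
    out: List[Frame] = []
    # Before any packet has arrived, the only available context is silence.
    last: Frame = [0.0] * frame_len
    for frame in frames:
        if frame is None:
            # Lost packet: repeat the previous output, scaled down so that
            # long bursts of losses fade towards silence.
            last = [decay * s for s in last]
        else:
            # Real packet received: use it as is and reset the context.
            last = list(frame)
        out.append(last)
    return out


if __name__ == "__main__":
    received = [[1.0, -1.0], None, None, [0.5, 0.5]]
    for frame in conceal(received, frame_len=2):
        print(frame)
```

Repetition of this kind tends to degrade quickly over long gaps, which is the regime where the abstract reports learned predictors doing better.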


On Deep Speech Packet Loss Concealment: A Mini-Survey
This mini-survey reviews all the literature to date that attempts to solve packet loss in speech using deep learning methods, and briefly reviews how the problem of packet loss in a realistic setting is modelled and how packet loss concealment techniques are evaluated.
Speech Loss Compensation by Generative Adversarial Networks
A generative adversarial network structure, which takes deep convolutional neural networks as the generator and discriminator components, is adopted as a general framework for speech loss compensation and achieves better performance than the baseline systems.
Packet Loss Concealment Based on Deep Neural Networks for Digital Speech Transmission
The proposed regression-based packet loss concealment for digital speech transmission, which uses deep neural networks (DNNs) with a multiple-layer deep architecture, provides better speech quality and speech recognition accuracy than the conventional approaches.
High Fidelity Speech Synthesis with Adversarial Networks
GAN-TTS is capable of generating high-fidelity speech with naturalness comparable to the state-of-the-art models, and unlike autoregressive models, it is highly parallelisable thanks to an efficient feed-forward generator.
Hidden Markov model-based packet loss concealment for voice over IP
With a hidden Markov model (HMM) tracking the evolution of speech signal parameters, it is demonstrated how PLC is performed within a statistical signal processing framework and how the HMM is used to index a specially designed PLC module for the particular signal context, leading to signal-contingent PLC.
Speech Prediction Using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment
The proposed predictor is a single end-to-end network that captures all sorts of dependencies between samples, and therefore has the potential to outperform classical linear/non-linear and short-term/long-term speech predictor structures.
Efficient, end-to-end and self-supervised methods for speech processing and generation
This thesis proposes the use of recent pseudo-recurrent structures, like self-attention models and quasi-recurrent networks, to build acoustic models for text-to-speech. It also proposes a problem-agnostic speech encoder, named PASE, which is a fully convolutional network that yields compact representations from speech waveforms.
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
The model is non-autoregressive, fully convolutional, with significantly fewer parameters than competing models and generalizes to unseen speakers for mel-spectrogram inversion, and suggests a set of guidelines to design general purpose discriminators and generators for conditional sequence synthesis tasks.
Least Squares Generative Adversarial Networks
This paper proposes the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator, and shows that minimizing the objective function of LSGAN yields minimizing the Pearson χ² divergence.
Image-to-Image Translation with Conditional Adversarial Networks
Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.