HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

@inproceedings{Su2020HiFiGANHD,
  title={HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks},
  author={Jiaqi Su and Zeyu Jin and A. Finkelstein},
  booktitle={INTERSPEECH},
  year={2020}
}
Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to transform recorded speech to sound as though it had been recorded in a studio. We use an end-to-end feed-forward WaveNet architecture, trained with multi-scale adversarial discriminators in both the time domain and the time-frequency domain. It relies on the deep feature matching losses of the discriminators to improve the…
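The abstract mentions deep feature matching losses taken from the discriminators. The excerpt is truncated before the exact formulation, so the following is only a minimal sketch of the generic idea as it commonly appears in GAN work: compare the discriminator's intermediate activations for a clean recording and for the generator output, layer by layer, using a mean L1 distance. The function names, the flat-list representation of feature maps, and the unweighted sum over layers are all assumptions made for illustration, not the paper's implementation.

```python
from typing import Sequence


def l1_mean(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized feature vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("feature vectors must be non-empty and equally sized")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def feature_matching_loss(
    real_feats: Sequence[Sequence[float]],
    fake_feats: Sequence[Sequence[float]],
) -> float:
    """Sum, over discriminator layers, of the mean L1 distance between the
    activations for real audio and for generated audio.

    Assumption: each layer's feature map is flattened to a list of floats,
    and every layer contributes with equal weight.
    """
    if len(real_feats) != len(fake_feats):
        raise ValueError("both inputs must cover the same number of layers")
    return sum(l1_mean(r, f) for r, f in zip(real_feats, fake_feats))


# Hypothetical activations from a two-layer discriminator (illustrative values).
real = [[0.0, 1.0], [2.0, 2.0, 2.0]]
fake = [[0.5, 1.0], [2.0, 1.0, 2.0]]
loss = feature_matching_loss(real, fake)
```

In a real training loop the activations would be tensors produced by each of the multi-scale discriminators, and the resulting term would be added to the generator's adversarial loss. The plain-list version above only shows the arithmetic.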
10 Citations

Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains
High Fidelity Speech Regeneration with Application to Speech Enhancement
  • 1
CDPAM: Contrastive learning for perceptual audio similarity
MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement
  • S. Fu, Cheng Yu, +4 authors Yu Tsao
  • Computer Science, Engineering
  • ArXiv
  • 2021
Perceptually Guided End-to-End Text-to-Speech
All for One and One for All: Improving Music Separation by Bridging Networks
Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement
  • Highly Influenced
Context-Aware Prosody Correction for Text-Based Speech Editing
