Corpus ID: 209460630

A Cycle-GAN Approach to Model Natural Perturbations in Speech for ASR Applications

@article{Dumpala2019ACA,
  title={A Cycle-GAN Approach to Model Natural Perturbations in Speech for ASR Applications},
  author={Sri Harsha Dumpala and Imran Sheikh and Rupayan Chakraborty and Sunil K. Kopparapu},
  journal={ArXiv},
  year={2019},
  volume={abs/1912.11151}
}
  • Sri Harsha Dumpala, Imran Sheikh, Rupayan Chakraborty, Sunil K. Kopparapu
  • Published 2019
  • Computer Science, Engineering
  • ArXiv
  • Naturally introduced perturbations in audio signal, caused by emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems. In this paper, we propose a front-end based on Cycle-Consistent Generative Adversarial Network (CycleGAN) which transforms naturally perturbed speech into normal speech, and hence improves the robustness of an ASR system. The CycleGAN model is trained on non-parallel examples of perturbed and normal…
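
The abstract's central idea, training on non-parallel perturbed and normal examples, rests on a cycle-consistency constraint: mapping perturbed features to the normal domain and back should recover the input, and likewise in the other direction. The following is a minimal sketch of that loss term only, not the authors' implementation. The function names, the L1 distance, and the toy "generators" are all assumptions made for illustration; the paper's actual networks, features, and loss weights are not given in this excerpt.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Generator = Callable[[Sequence[float]], Vector]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    perturbed: Sequence[float],
    normal: Sequence[float],
    g_p2n: Generator,  # hypothetical generator: perturbed -> normal
    g_n2p: Generator,  # hypothetical generator: normal -> perturbed
) -> float:
    """Sum of the two round-trip reconstruction errors.

    The two inputs need not be paired (or even the same length), which is
    why this term can be computed on non-parallel data.
    """
    # perturbed -> normal -> perturbed should reproduce the perturbed input
    forward = l1_distance(g_n2p(g_p2n(perturbed)), perturbed)
    # normal -> perturbed -> normal should reproduce the normal input
    backward = l1_distance(g_p2n(g_n2p(normal)), normal)
    return forward + backward


# Toy stand-ins for trained networks (assumption: a constant offset
# plays the role of the "perturbation").
def toy_p2n(x: Sequence[float]) -> Vector:
    return [v - 1.0 for v in x]


def toy_n2p(x: Sequence[float]) -> Vector:
    return [v + 1.0 for v in x]


def toy_n2p_mismatched(x: Sequence[float]) -> Vector:
    return [v + 2.0 for v in x]


perturbed_example = [1.0, 2.0, 3.0]  # unpaired with the normal example
normal_example = [0.5, 0.25]

consistent_loss = cycle_consistency_loss(
    perturbed_example, normal_example, toy_p2n, toy_n2p
)
mismatched_loss = cycle_consistency_loss(
    perturbed_example, normal_example, toy_p2n, toy_n2p_mismatched
)
```

When the two toy mappings are exact inverses the loss is zero. When they are not, each round trip leaves a residual that the loss penalizes. In a full CycleGAN this term is combined with adversarial losses for each domain, which are omitted here.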
