Voice-transformation-based data augmentation for prosodic classification

  • Raul Fernandez, Andrew Rosenberg, Alexander Sorin, Bhuvana Ramabhadran, Ron Hoory
  • Published 1 March 2017
  • Computer Science
  • 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
In this work we explore data-augmentation techniques for improving the performance of a supervised recurrent-neural-network classifier that predicts prosodic-boundary and pitch-accent labels. The technique applies voice transformations to the training data, modifying the pitch baseline and range as well as the vocal-tract and vocal-source characteristics of the speakers, to generate further training examples. We demonstrate the validity of the approach by…
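To make the idea of modifying the pitch baseline and range concrete, here is a minimal Python sketch. It is not the authors' voice-transformation system: it only manipulates a frame-level F0 contour, and it leaves out the vocal-tract and vocal-source modifications entirely. The function names `transform_f0` and `augment`, the parameters `baseline_shift_semitones` and `range_scale`, the convention that unvoiced frames are stored as 0, and the reuse of the original labels for each transformed copy are all assumptions made for illustration.

```python
import numpy as np


def transform_f0(f0_hz, baseline_shift_semitones=0.0, range_scale=1.0):
    """Shift the pitch baseline and scale the pitch range of an F0 contour.

    Hypothetical helper: works in the log-frequency domain, moving the mean
    by `baseline_shift_semitones` and scaling deviations from the mean by
    `range_scale`. Frames with f0 == 0 are treated as unvoiced and kept at 0.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    out = np.zeros_like(f0)
    voiced = f0 > 0
    if not voiced.any():
        return out
    log_f0 = np.log2(f0[voiced])
    mean = log_f0.mean()
    # 12 semitones per octave, so a shift of s semitones is s / 12 in log2 units.
    new_log_f0 = mean + baseline_shift_semitones / 12.0 + range_scale * (log_f0 - mean)
    out[voiced] = 2.0 ** new_log_f0
    return out


def augment(f0_hz, labels, settings):
    """Return the original example plus one transformed copy per setting.

    Assumption: the prosodic labels are unchanged by the transformation,
    so each transformed contour is paired with the original labels.
    """
    examples = [(np.asarray(f0_hz, dtype=float), labels)]
    for shift, scale in settings:
        examples.append((transform_f0(f0_hz, shift, scale), labels))
    return examples


if __name__ == "__main__":
    contour = [0.0, 100.0, 200.0, 0.0]   # Hz, zeros mark unvoiced frames
    labels = ["none", "accent", "accent", "boundary"]  # made-up labels
    for f0, lab in augment(contour, labels, [(12.0, 1.0), (0.0, 0.5)]):
        print(np.round(f0, 2), lab)
```

Working in the log domain keeps the transformation perceptually uniform across speakers with different pitch levels; a linear-Hz variant would be an equally valid sketch under different assumptions.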

