On the impact of phoneme alignment in DNN-based speech synthesis

Abstract

Recently, deep neural networks (DNNs) have significantly improved the performance of acoustic modeling in statistical parametric speech synthesis (SPSS). However, in current implementations, when training a DNN-based speech synthesis system, phonetic transcripts are required to be aligned with the corresponding speech frames to obtain the phonetic…
DOI: 10.21437/SSW.2016-32

6 Figures and Tables
