On the impact of phoneme alignment in DNN-based speech synthesis


Recently, deep neural networks (DNNs) have significantly improved the performance of acoustic modeling in statistical parametric speech synthesis (SPSS). However, in current implementations, when training a DNN-based speech synthesis system, phonetic transcripts are required to be aligned with the corresponding speech frames to obtain the phonetic…
DOI: 10.21437/SSW.2016-32

6 Figures and Tables


