Decha Moungsri

This paper introduces frame-based Gaussian process regression (GPR) into phone/syllable duration modeling for Thai speech synthesis. The GPR model is designed to predict frame-level acoustic features from the corresponding frame information, which includes the relative position in each unit of the utterance structure and linguistic information such as tone type and…
This paper describes an approach to HMM-based Thai speech synthesis using stress context. It has been shown that context related to stressed/unstressed syllable information (stress context) significantly improves the tone correctness of synthetic speech, but using it in tone modeling requires a manual context-labeling process. To reduce…
This paper proposes an alternative multi-level approach to duration prediction that uses multiple Gaussian process experts to improve prosody generation in statistical parametric speech synthesis. We use two duration models at different levels: syllable and phone. First, we train the syllable- and phone-level duration models individually. Then, the…
In continuous Thai speech, tone pronunciation is affected by several factors. One significant factor is stress, which causes diversity in the F0 contours of tones and affects syllable durations. Our previous studies have shown that a stressed/unstressed syllable context improves tone modeling accuracy. However, stress in Thai is…
In Thai, stress is an important prosodic feature that not only affects naturalness but also plays a crucial role in the meaning of phrase-level utterances. It has been observed that a speech synthesis model trained without stress and phrase-level information produces incorrect tones and ambiguous meaning in synthetic speech. Our previous work has…