Hung-Yan Gu

Abstract: This paper describes the first successfully implemented real-time Mandarin dictation machine in the world, which recognizes Mandarin speech with a very large vocabulary and almost unlimited texts for the input of Chinese characters into computers. Considering the special characteristics of the Chinese language, syllables are chosen as the(More)
In many applications, Chinese information is very often provided in the form of phonetic symbol sequences, and it is desired to decode such sequences into the corresponding Chinese character sequences (sentences) as the output. Phonetic input of Chinese characters into computers is a typical example. The problem is due primarily to the high degree of(More)
In this paper, an approach that combines HMM spectrum models and ANN prosody models is proposed to construct a speech synthesis system. Currently, a Mandarin corpus is used to show the feasibility of this approach. We hope that this approach can be used in other syllable-prominent languages such as Min-Nan and Hakka. In the training phase, DCC (discrete(More)
In this paper, HNM (harmonic plus noise model) is enhanced and used to design a scheme for synthesizing a Mandarin Chinese singing voice. Enhancements made include a Lagrange-interpolation based estimation of spectral envelope, piecewise linear mapping of time axes, fixed-pace placement of control points, and other modifications for analyzing HNM parameters(More)
In this paper, a method with sentence-wide optimization consideration is proposed to generate a Mandarin sentence's pitch-contour. The developed model is called the sentence pitch-contour HMM (SPC-HMM) due to its use of VQ (vector quantization) and HMM (hidden Markov model). To construct an SPC-HMM, the pitch-contours of the syllables from each training(More)
In this paper, a framework for integrated synthesis of Mandarin, Min-nan, and Hakka speech is proposed. To show its feasibility, an initial integrated system has been built as well. Through integration, a model trained only with Min-nan sentences is used to generate pitch-contours for all three languages, the same rules are used to generate syllable duration(More)