Chi-Chun Hsia

This paper presents an approach to hierarchical prosody conversion for emotional speech synthesis. The pitch contour of the source speech is decomposed into a hierarchical prosodic structure consisting of sentence, prosodic word, and subsyllable levels. The pitch contour at the higher level is encoded by discrete Legendre polynomial coefficients. The ...
This paper presents an expressive voice conversion model (DeBi-HMM) as a post-processing stage of a text-to-speech (TTS) system for expressive speech synthesis. DeBi-HMM is named for the duration-embedded characteristic of its two HMMs, which model the source and target speech signals, respectively. Joint estimation of source and target HMMs is exploited for ...
This paper proposes a method for modeling and generating pitch in hidden Markov model (HMM)-based Mandarin speech synthesis by exploiting prosody hierarchy and dynamic pitch features. The prosodic structure of a sentence is represented by a prosody hierarchy, which is constructed from the predicted prosodic breaks using a supervised classification and ...
In emotional speech synthesis, a large speech database is required for high-quality speech output. Voice conversion needs only a compact speech database for each emotion. This study designs and collects a set of small, phonetically balanced emotional parallel speech databases to construct conversion functions. The Gaussian mixture bigram ...
Sleeping posture reveals important information for eldercare and patient care, especially for bedridden patients. Prior work has addressed the problem using either pressure sensors or video images. This paper presents a multimodal approach to sleeping posture classification. Features from the pressure sensor map and the video image have been proposed in ...