Minghui Dong

Stress affects speech characteristics and can influence the quality of speech. In this paper, we study the effect of Sleep Deprivation (SD) on speech characteristics and classify Normal Speech (NS) and Sleep-Deprived Speech (SDS). One indicator of sleep deprivation is a 'flattened voice'. We examine pitch and harmonic locations to analyse …
This paper describes our recent efforts in exploring effective discriminative features for speaker recognition. Recent research has indicated that the appropriate fusion of features is critical to improving the performance of speaker recognition systems. In this paper we describe our approaches for the NIST 2006 Speaker Recognition Evaluation. Our system …
Natural pitch fluctuations are essential to human singing. To effectively synthesize singing voice, the generation of these pitch fluctuations is necessary. Previous synthesis methods classify and reproduce them individually. These fluctuations, however, are found to be dependent and vary under different contexts. This paper proposes a generalized framework …
The paper presents a unit selection-based speech synthesis approach for Mandarin Chinese. The unit selection-based approach generates speech by selecting proper units from a speech corpus and connecting them together. In this approach, a set of features is defined to describe the speech units in the corpus and the expected units in the synthesized utterance. …
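As a rough illustration of the unit selection idea in the abstract above (choosing corpus units that match the expected units and that connect well), the following is a minimal sketch and not the paper's method. The `Unit` fields, both cost functions, their weights, and the toy corpus are all hypothetical; the search is a standard Viterbi pass over candidates, which is a common choice for unit selection but is assumed here rather than taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    label: str       # unit identity, e.g. a syllable (hypothetical feature)
    pitch: float     # mean F0 in Hz (hypothetical feature)
    duration: float  # length in seconds (hypothetical feature)


def target_cost(expected: Unit, cand: Unit) -> float:
    """Mismatch between an expected unit and a corpus candidate (assumed form)."""
    return abs(expected.pitch - cand.pitch) / 100.0 + abs(expected.duration - cand.duration)


def join_cost(left: Unit, right: Unit) -> float:
    """Pitch discontinuity between two adjacent selected units (assumed form)."""
    return abs(left.pitch - right.pitch) / 100.0


def select_units(targets, corpus):
    """Return the candidate sequence with minimum total target + join cost."""
    if not targets:
        return []
    # One column of candidates per expected unit: corpus units with the same label.
    lattice = [[u for u in corpus if u.label == t.label] for t in targets]
    if any(not column for column in lattice):
        raise ValueError("corpus has no candidate for some expected unit")

    # best[i][j] = (cost of the cheapest path ending at lattice[i][j], back-pointer)
    best = [[(target_cost(targets[0], c), -1) for c in lattice[0]]]
    for i in range(1, len(targets)):
        column = []
        for cand in lattice[i]:
            prev_cost, prev_idx = min(
                (best[i - 1][k][0] + join_cost(prev, cand), k)
                for k, prev in enumerate(lattice[i - 1])
            )
            column.append((prev_cost + target_cost(targets[i], cand), prev_idx))
        best.append(column)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    return path[::-1]


# Toy data, invented purely for illustration.
corpus = [
    Unit("ni", 220.0, 0.20),
    Unit("ni", 180.0, 0.25),
    Unit("hao", 210.0, 0.30),
    Unit("hao", 150.0, 0.30),
]
targets = [Unit("ni", 200.0, 0.22), Unit("hao", 200.0, 0.30)]
selected = select_units(targets, corpus)
```

In this sketch the join cost is what discourages pairing units with very different pitch, even when each is individually close to its expected unit. A real system would use richer features and would concatenate the waveforms of the selected units.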
This paper describes an approach to generating prosody parameters for a Mandarin Chinese text-to-speech system. The Chinese fundamental frequency contour is decomposed into two parts, a global intonation contour and a syllable-level tone contour. The global intonation contour is converted to pitch target labels in the corpus. It is predicted by first predicting …
This paper presents a sparse representation framework for weighted frequency warping based voice conversion. In this method, a frame-dependent warping function and the corresponding spectral residual vector are first calculated for each source-target spectrum pair. At runtime conversion, a source spectrum is factorised as a linear combination of a set of …