Yau-Tarng Juang

Most speaker recognition systems use only low-level, short-term spectral features and ignore high-level, long-term information such as prosody and speaking style. This paper presents a novel eigen-prosody analysis (EPA) approach that captures a speaker's long-term prosodic information for robust speaker recognition under mismatched environments. It converts ...
Unseen handset mismatch and limited training/test data are the major sources of performance degradation for speaker identification in telecommunication environments. In this paper, vector quantization (VQ)-based prosody modeling and eigen-prosody analysis (EPA) are integrated to transform the closed-set speaker identification problem into a full text ...
In this investigation, two probabilistic latent semantic analysis (PLSA)-based approaches are proposed for use in speaker verification systems. They reduce the number of parameters required by prosodic speaker models in order to (1) reliably estimate speakers' bi-gram models and (2) reduce the amount of training and test data required. The basic concept is to (1) ...