Most speaker recognition systems use only low-level, short-term spectral features and ignore high-level, long-term information such as prosody and speaking style. This paper presents a novel eigen-prosody analysis (EPA) approach that captures a speaker's long-term prosodic information for robust speaker recognition in mismatched environments. It converts …
Unseen handset mismatch and limited training/test data are the major sources of performance degradation for speaker identification in telecommunication environments. In this paper, vector quantization (VQ)-based prosody modeling and eigen-prosody analysis (EPA) are integrated to transform the closed-set speaker identification problem into a full text …
In this investigation, two probabilistic latent semantic analysis (PLSA)-based approaches are proposed for speaker verification systems. They reduce the number of parameters required by prosodic speaker models in order to (1) reliably estimate speakers' bi-gram models and (2) reduce the amount of training and test data required. The basic concept is to (1) …
Handsets that are not seen in the training phase (unseen handsets) are significant sources of performance degradation for speaker identification (SID) applications in telecommunication environments. In this paper, a novel latent prosody analysis (LPA) approach to automatically extract the most discriminative prosodic cues for assisting in conventional …