Learn More
—This paper presents an extension of our previous work which proposes a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker-and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space because it models both speaker and channel variabilities. Two(More)
—We propose a new approach to the problem of estimating the hyperparameters which define the interspeaker variability model in joint factor analysis. We tested the proposed estimation technique on the NIST 2006 speaker recognition evaluation data and obtained 10%–15% reductions in error rates on the core condition and the extended data condition (as(More)
In this paper, we describe systems that were developed for the Open Performance Sub-Challenge of the INTERSPEECH 2009 Emotion Challenge. We participate in both two-class and five-class emotion detection. For the two-class problem, the best performance is obtained by logistic regression fusion of three systems. These systems use short-and long-term speech(More)
The aim of this paper is to compare different log-likelihood scoring methods, that different sites used in the latest state-of-the-art Joint Factor Analysis (JFA) Speaker Recognition systems. The algorithms use various assumptions and have been derived from various approximations of the objective functions of JFA. We compare the techniques in terms of speed(More)
—In this paper, we introduce the use of continuous prosodic features for speaker recognition and we show how they can be modeled using joint factor analysis. Similar features have been successfully used in language identification. These prosodic features are pitch and energy contours spanning a syllable-like unit. They are extracted using a basis consisting(More)
This article presents several techniques to combine between Support vector machines (SVM) and Joint Factor Analysis (JFA) model for speaker verification. In this combination, the SVMs are applied to different sources of information produced by the JFA. These infor-mations are the Gaussian Mixture Model supervectors and speakers and Common factors. We found(More)
In this paper, a new language identification system is presented based on the total variability approach previously developed in the field of speaker identification. Various techniques are employed to extract the most salient features in the lower dimensional i-vector space and the system developed results in excellent performance on the 2009 LRE evaluation(More)
(Top Left) A visualization of the first two principal components of the i-vectors in a three-speaker conversation. The rest of the plots show the result of VBEM-GMM clustering after a single iteration (top right), three iterations (bottom right), and the final results (bottom left). For more see " Unsupervised Methods for Speaker Diarization: An Integrated(More)
In this paper, we introduced the use of formants contours with prosodic contours based on pitch and energy for speaker recognition. These contours are modeled on continuous manners by using the Legendre polynomials on basic unit which represents syllables. The parameters extracted from the Legendre polyno-mials coefficients plus the syllables duration are(More)