This article presents several techniques for combining Support Vector Machines (SVMs) with the Joint Factor Analysis (JFA) model for speaker verification. In this combination, the SVMs are applied to different sources of information produced by the JFA: the Gaussian Mixture Model supervectors and the speaker and common factors. We found…
In this paper, we describe recent progress in i-vector based speaker verification. The use of universal background models (UBMs) with full-covariance matrices is suggested and tested thoroughly in experiments. The i-vectors are scored using a simple cosine distance and advanced techniques such as Probabilistic Linear Discriminant Analysis (PLDA) and…
This paper describes the improvements introduced in the Loquendo–Politecnico di Torino (LPT) speaker recognition system submitted to the NIST SRE08 evaluation campaign. This system, which was among the best participants in this evaluation, combines the results of three core acoustic systems, two based on Gaussian Mixture Models (GMMs), and one on Phonetic…
This work presents two contributions to language identification. The first contribution is the definition of a set of properly selected time-frequency features that are a valid alternative to the commonly used Shifted Delta Cepstral features. As a second contribution, we show that significant performance improvement in language recognition can be obtained…
The work presented in this paper is an extension of our two previous works [1, 2]. In the first paper [1], we proposed a low-dimensional feature (i-vector) extractor that is suitable for both telephone and microphone data of the NIST speaker recognition evaluation dataset. The second paper [2] introduces the use of Probabilistic Linear Discriminant…
Nowadays, speaker recognition is relatively mature in its basic scheme, in which a speaker model is trained using speech from the target speaker and speech from a large number of non-target speakers. However, the speech from non-target speakers is typically used only to estimate a general speech distribution (e.g., a UBM). It is not used to find the "…