Learn More
We report on work on speaker diarization of telephone conversations which was begun at the Robust Speaker Recognition Workshop held at Johns Hopkins University in 2008. Three diarization systems were developed and experiments were conducted using the summed-channel telephone data from the 2008 NIST speaker recognition evaluation. The systems are a Baseline(More)
This paper presents a stream-based approach for unsupervised multi-speaker conversational speech segmentation. The main idea of this work is to exploit prior knowledge about the speaker space to find a low dimensional vector of speaker factors that summarize the salient speaker characteristics. This new approach produces segmentation error rates that are(More)
In this paper, we describe recent progress in i-vector based speaker verification. The use of universal background models (UBM) with full-covariance matrices is suggested and thoroughly experimentally tested. The i-vectors are scored using a simple cosine distance and advanced techniques such as Probabilistic Linear Discriminant Analysis (PLDA) and(More)
This article presents several techniques to combine between Support vector machines (SVM) and Joint Factor Analysis (JFA) model for speaker verification. In this combination, the SVMs are applied to different sources of information produced by the JFA. These informations are the Gaussian Mixture Model supervectors and speakers and Common factors. We found(More)
Porto, the institutional repository of the Politecnico di Torino, is provided by the University Library and the IT-Services. The aim is to enable open access to all the world. Please share with us how this access benefits you. Your story matters. Abstract Gaussian Mixture Models (GMMs) in combination with Support Vector Machine (SVM) classifiers have been(More)
The work presented in this paper is an extension of our two previous works [1, 2]. In the first paper [1], we proposed a low dimensional feature (i-vectors) extractor which is suitable for both telephone and microphone data of the NIST speaker recognition evaluation dataset. The second paper [2] introduces the use of Probabilistic Linear Discriminant(More)
This work presents two contributions to language identification. The first contribution is the definition of a set of properly selected time-frequency features that are a valid alternative to the commonly used shifted delta cepstral features. As a second contribution, we show that significant performance improvement in language recognition can be obtained(More)
The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model based. Most of them have been proposed for Gaussian mixture models, while in the feature domain blind channel compensation is usually performed.(More)