Claudio Vair

This paper presents a stream-based approach for unsupervised multi-speaker conversational speech segmentation. The main idea of this work is to exploit prior knowledge about the speaker space to find a low dimensional vector of speaker factors that summarize the salient speaker characteristics. This new approach produces segmentation error rates that are…
The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model based. Most of them have been proposed for Gaussian mixture models, while in the feature domain blind channel compensation is usually performed.
Gaussian Mixture Models (GMMs) in combination with Support Vector Machine (SVM) classifiers have been shown to give excellent classification accuracy in speaker recognition. In this work we use this approach for language identification, and we compare its performance with the standard approach based on GMMs. In the GMM-SVM framework, a GMM is trained for…
This paper describes the Loquendo – Politecnico di Torino system evaluated on the 2006 NIST speaker recognition evaluation dataset. This system was among the best participants in this evaluation. It combines the results of two independent GMM systems: a Phonetic GMM and a classical GMM. Both systems rely on an intersession variation compensation approach, …
This work presents two contributions to language identification. The first contribution is the definition of a set of properly selected time-frequency features that are a valid alternative to the commonly used shifted delta cepstral features. As a second contribution, we show that significant performance improvement in language recognition can be obtained…
This paper describes the experimental setup and the results obtained using several state-of-the-art speaker recognition classifiers. The comparison of the different approaches aims at the development of real world applications, taking into account memory and computational constraints, and possible mismatches with respect to the training environment. The…
In this paper, we present an integration of Data Driven Parallel Model Combination (DPMC) and Bayesian Learning into a fast and accurate framework which can be easily integrated in standard training and recognition systems. The original DPMC technique has been enhanced to avoid any modification of the acoustic models, as required by the original method. The…
This paper presents our approach to unsupervised multispeaker conversational speech segmentation. Speech segmentation is obtained in two steps that employ different techniques. The first step performs a preliminary segmentation of the conversation analyzing fixed length slices, and assumes the presence in every slice of one or two speakers. The second step…
This paper describes the improvements introduced in the Loquendo-Politecnico di Torino (LPT) speaker recognition system submitted to the NIST SRE08 evaluation campaign. This system, which was among the best participants in this evaluation, combines the results of three core acoustic systems, two based on Gaussian Mixture Models (GMMs), and one on Phonetic…