Recently, i-vector extraction and Probabilistic Linear Discriminant Analysis (PLDA) have proven to provide state-of-the-art speaker verification performance. In this paper, the speaker verification score for a pair of i-vectors representing a trial is computed with a functional form derived from the successful PLDA generative model. In our case, however,(More)
Most state–of–the–art speaker recognition systems are based on Gaussian Mixture Models (GMMs), where a speech segment is given a compact representation, referred to as an "identity vector" (i-vector for short), extracted by means of Factor Analysis. The main advantage of this representation is that the problem of intersession variability(More)
Phonotactic models based on bags of n-grams representations and discriminative classifiers are a popular approach to the language recognition problem. However, the large size of n-gram count vectors brings about some difficulties in discriminative classifiers. The subspace Multinomial model was recently proposed to effectively represent information(More)
Most of the state–of–the–art speaker recognition systems use a compact representation of spoken(More)
This work presents a new approach to discriminative speaker verification. Rather than estimating speaker models, or a model that discriminates between a speaker class and the class of all the other speakers, we directly solve the problem of classifying pairs of utterances as belonging to the same speaker or not.
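The pairwise idea above can be illustrated with a minimal sketch. It assumes, hypothetically, that each utterance is already reduced to a fixed-length vector and that a trial is scored with a symmetric quadratic function of the two vectors. The abstract does not state the scoring function, so `pair_score`, `same_speaker`, and the parameters `LAM`, `GAM`, `C`, `K` are illustrative names with hand-picked toy values, not trained parameters or the authors' method.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def bilinear(a: Vector, m: Matrix, b: Vector) -> float:
    """Return a^T m b for plain Python sequences."""
    return sum(a[i] * m[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def pair_score(x1: Vector, x2: Vector, lam: Matrix, gam: Matrix,
               c: Vector, k: float) -> float:
    """Score a trial (x1, x2); swapping x1 and x2 gives the same score.

    The quadratic form is an assumption made for illustration only.
    """
    cross = bilinear(x1, lam, x2) + bilinear(x2, lam, x1)
    within = bilinear(x1, gam, x1) + bilinear(x2, gam, x2)
    linear = sum(ci * (a + b) for ci, a, b in zip(c, x1, x2))
    return cross + within + linear + k


def same_speaker(x1: Vector, x2: Vector, lam: Matrix, gam: Matrix,
                 c: Vector, k: float, threshold: float = 0.0) -> bool:
    """Classify the pair directly, without building any per-speaker model."""
    return pair_score(x1, x2, lam, gam, c, k) > threshold


# Toy parameters (hypothetical): reward agreement between the two vectors,
# penalize their individual norms.
LAM = [[1.0, 0.0], [0.0, 1.0]]
GAM = [[-0.5, 0.0], [0.0, -0.5]]
C = [0.0, 0.0]
K = 0.0

print(same_speaker([1.0, 0.0], [1.0, 0.0], LAM, GAM, C, K))   # identical vectors
print(same_speaker([1.0, 0.0], [-1.0, 0.0], LAM, GAM, C, K))  # opposite vectors
```

In a real system the parameters would be learned discriminatively from labeled same-speaker and different-speaker pairs; the sketch only shows that the decision is made on the pair itself.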
The i-vector extraction process is affected by several factors such as the noise level, the acoustic content of the observed features, and the duration of the analyzed speech segment. These factors influence both the i-vector estimate and its uncertainty, represented by the i-vector posterior covariance. This paper presents a new PLDA model that, unlike the(More)
This paper contains a description of data, systems and fusions developed by the joint team of Brno University of Technology (BUT), Politecnico di Torino (PoliTo) and AGNITIO for the NIST 2011 Language Recognition Evaluation. The primary submission was a fusion of one acoustic and three phonotactic systems, with extensive use of sub-space projections for(More)
This paper describes the speaker identification (SID) system developed by the Patrol team for the first phase of the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We present results using multiple SID systems differing mainly(More)
Several applications in Computer Vision, such as recognition, identification, automatic 3D modeling and animation, and non-conventional human-computer interaction, require the precise identification of landmark points in facial images. Here we present a fast and robust algorithm capable of identifying a specific set of landmarks on face profile images. First,(More)