Oldrich Plchot

In this paper, we describe recent progress in i-vector based speaker verification. The use of universal background models (UBM) with full-covariance matrices is suggested and thoroughly experimentally tested. The i-vectors are scored using a simple cosine distance and advanced techniques such as Probabilistic Linear Discriminant Analysis (PLDA) and(More)
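As an illustration of the simple cosine scoring mentioned above, here is a minimal sketch in Python. It assumes i-vectors are plain lists of floats and computes the cosine similarity of the pair, which is the quantity usually meant by "cosine distance" scoring. The function names and the threshold value are hypothetical and are not taken from the paper.

```python
import math


def cosine_score(w_enroll, w_test):
    """Cosine similarity between two i-vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(w_enroll, w_test))
    norm_enroll = math.sqrt(sum(a * a for a in w_enroll))
    norm_test = math.sqrt(sum(b * b for b in w_test))
    return dot / (norm_enroll * norm_test)


def verify(w_enroll, w_test, threshold=0.5):
    """Accept the trial when the score reaches the threshold.

    The threshold is an illustrative assumption; in practice it would be
    tuned on held-out trials.
    """
    return cosine_score(w_enroll, w_test) >= threshold
```

Because the score depends only on the angle between the two vectors, it is unaffected by their lengths, which is one reason it works as a simple baseline.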
The concept of so-called iVectors, where each utterance is represented by a fixed-length, low-dimensional feature vector, has recently become very successful in speaker verification. In this work, we apply the same idea in the context of Language Recognition (LR). To recognize language in the iVector space, we experiment with three different linear(More)
Recently, i-vector extraction and Probabilistic Linear Discriminant Analysis (PLDA) have proven to provide state-of-the-art speaker verification performance. In this paper, the speaker verification score for a pair of i-vectors representing a trial is computed with a functional form derived from the successful PLDA generative model. In our case, however,(More)
This work studies the use of deep neural networks (DNNs) to address automatic language identification (LID). Motivated by their recent success in acoustic modelling, we adapt DNNs to the problem of identifying the language of a given spoken utterance from short-term acoustic features. The proposed approach is compared to state-of-the-art i-vector based(More)
This work presents a new and efficient approach to discriminative speaker verification in the i-vector space. We illustrate the development of a linear discriminative classifier that is trained to discriminate between the hypotheses that the pair of feature vectors in a trial belongs to the same speaker or to different speakers. This approach is an alternative to(More)
The i-vector extraction process is affected by several factors such as the noise level, the acoustic content of the observed features, and the duration of the analyzed speech segment. These factors influence both the i-vector estimate and its uncertainty, represented by the i-vector posterior covariance. This paper presents a new PLDA model that, unlike the(More)
This paper describes a novel approach to phonotactic LID, where instead of using soft-counts based on phoneme lattices, we use posteriorgrams to obtain n-gram counts. The high-dimensional vectors of counts are reduced to low-dimensional units for which we adopted the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is(More)
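To make the idea of n-gram counts from a posteriorgram concrete, the toy sketch below accumulates expected bigram counts from per-frame phoneme posteriors. It rests on a loudly stated assumption that is not from the paper: adjacent frames are treated as independent, so the expected count of a bigram is the sum over time of the product of the two posteriors. The function name and the frame-level treatment are hypothetical; the paper's actual procedure is not given in the excerpt.

```python
from collections import defaultdict


def expected_bigram_counts(posteriorgram, phonemes):
    """Accumulate soft bigram counts from a posteriorgram.

    posteriorgram: list of frames, each a list of posteriors over `phonemes`.
    Assumption (illustrative only): adjacent frames are independent, so
    count(a, b) = sum over t of p_t(a) * p_{t+1}(b).
    """
    counts = defaultdict(float)
    for prev, cur in zip(posteriorgram, posteriorgram[1:]):
        for i, p_prev in enumerate(prev):
            for j, p_cur in enumerate(cur):
                counts[(phonemes[i], phonemes[j])] += p_prev * p_cur
    return dict(counts)


# Three frames over a two-phoneme inventory.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
counts = expected_bigram_counts(frames, ["a", "b"])
```

When every frame's posteriors sum to one, the counts sum to the number of adjacent frame pairs, so the resulting vector behaves like a soft histogram over bigrams.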
This work studies the use of Deep Neural Network (DNN) bottleneck (BN) features together with traditional MFCC features in the task of i-vector-based speaker recognition. We decouple the sufficient statistics extraction by using separate GMM models for frame alignment and for statistics normalization, and we analyze the usage of BN and MFCC(More)
This paper addresses a novel technique for representation and processing of n-gram counts in phonotactic language recognition (LRE): subspace multinomial modelling represents the vectors of n-gram counts by low-dimensional vectors of coordinates in the total variability subspace, called iVectors. Two techniques for iVector scoring are tested: support vector(More)
Phonotactic language recognition is one of the major techniques used for automatic recognition of spoken languages. We propose a feature extraction technique based on PCA to be used with SVM-based systems. This technique improves training speed, in some cases by more than 1000 times, allowing systems to be effectively trained on much larger data sets.(More)