Speaker Verification via Estimating Total Variability Space Using Probabilistic Partial Least Squares

@inproceedings{Chen2017SpeakerVV,
  title={Speaker Verification via Estimating Total Variability Space Using Probabilistic Partial Least Squares},
  author={C. Chen and Jiqing Han and Yilin Pan},
  booktitle={INTERSPEECH},
  year={2017}
}
The i-vector framework is one of the most popular methods in speaker verification, and estimating the total variability space (TVS) is a key part of that framework. Current estimation methods pay little attention to how discriminative the TVS is, even though this has an important influence on verification performance. We therefore focus on the discrimination of the TVS to achieve better performance. In this paper, a discriminative estimating method of TVS based on probabilistic… 

Task-Driven Variability Model for Speaker Verification
TLDR
A task-driven variability model (TDVM) is proposed to jointly estimate the TVM and the PLDA classifier, and results show that the TDVM method can achieve better performance than the traditional TVM/PLDA and a VGG-M network with different cost functions.
Supervector Compression Strategies to Speed up I-Vector System Development
TLDR
Supervector compression approaches, including two supervised ones, supervised PPCA (SPPCA) and the recently proposed probabilistic partial least squares (PPLS), are used to compress MAP-adapted GMM supervectors; the results suggest that, in terms of ASV accuracy, these approaches are on a par with FEFA.
TDMF: Task-Driven Multilevel Framework for End-to-End Speaker Verification
  • Chen Chen, Jiqing Han
  • Computer Science
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
The experimental results show that the TDMF can achieve better performance than that of the typical i-vector framework and VGG-M convolutional neural networks (CNN) framework.

References

Maximum Likelihood i-vector Space Using PCA for Speaker Verification
TLDR
This paper proposes a new approach to training the i-vector space using a variant of PCA with the Baum-Welch statistics for speaker verification, and the results show that the performances in two total variability spaces are comparable.
A partial least squares framework for speaker recognition
TLDR
This work develops a method for modeling the variability associated with each class (speaker) by using partial-least-squares - a latent variable modeling technique, which isolates the most informative subspace for each speaker.
Local Variability Modeling for Text-Independent Speaker Verification
TLDR
The local variability model (LVM) is proposed, the central idea of which is to capture the local variability associated with individual Gaussians in the acoustic space that are absent in the i-vector representation.
Front-End Factor Analysis For Speaker Verification
  • Florin Curelaru
  • Computer Science
    2018 International Conference on Communications (COMM)
  • 2018
TLDR
This paper investigates which configuration and which parameters lead to the best performance of an i-vectors/PLDA based speaker verification system and presents at the end some preliminary experiments in which the utterances comprised in the CSTR VCTK corpus were used besides utterances from MIT-MDSVC for training the total variability covariance matrix and the underlying PLDA matrices.
A Study of Interspeaker Variability in Speaker Verification
TLDR
It is shown that when a large joint factor analysis model is trained in this way and tested on the core condition, the extended data condition and the cross-channel condition, it is capable of performing at least as well as fusions of multiple systems of other types.
Minimax i-vector extractor for short duration speaker verification
TLDR
This study proposes to use a minimax strategy to estimate the sufficient statistics in order to increase the robustness of the extracted i-vectors and shows by experiments that the proposed minimax technique can improve over the baseline system from 9.89% to 7.99% on the NIST SRE 2010 8conv-10sec task.
Joint Factor Analysis Versus Eigenchannels in Speaker Recognition
TLDR
It is shown how the two approaches to the problem of session variability in Gaussian mixture model (GMM)-based speaker verification, eigenchannels, and joint factor analysis can be implemented using essentially the same software at all stages except for the enrollment of target speakers.
Distant Speaker Recognition: An Overview
TLDR
State-of-the-art techniques in DSR such as robust feature extraction, feature normalization, robust speaker modeling, model compensation, dereverberation and score normalization are discussed to overcome the speech degradation components i.e., reverberation and ambient noise.
Summary and initial results of the 2013-2014 speaker recognition i-vector machine learning challenge
TLDR
From late 2013 through early 2014, NIST coordinated a special i-vector challenge based on data used in previous NIST Speaker Recognition Evaluations; initial results indicate that the leading system achieved an approximate 37% improvement relative to the baseline system.
An overview of text-independent speaker recognition: From features to supervectors