State-of-the-art Gaussian mixture model (GMM)-based speaker recognition/verification systems utilize a universal background model (UBM), which typically requires extensive resources, especially if multiple channel and microphone categories are considered. In this study, a systematic analysis of speaker verification system performance is considered for which …
Speaker recognition systems trained on long duration utterances are known to perform significantly worse when short test segments are encountered. To address this mismatch, we analyze the effect of duration variability on phoneme distributions of speech utterances and i-vector length. We demonstrate that, as utterance duration is decreased, the number of …
The recently introduced mean Hilbert envelope coefficients (MHEC) have been shown to be an effective alternative to MFCCs for robust speaker identification under noisy and reverberant conditions in relatively small tasks. In this study, we investigate the effectiveness of these acoustic features in the context of a state-of-the-art speaker recognition …
Robustness to mismatched train/test conditions is one of the biggest challenges facing speaker recognition today, with transmission channel/handset and additive noise distortion being the most prominent factors. One limitation of recent speaker recognition systems is that they are based on a latent factor analysis modeling of the GMM mean …
This paper describes the systems developed by the Center for Robust Speech Systems (CRSS) for the 2012 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE). Given that the emphasis of SRE'12 is on noisy and short duration test conditions, our system development focused on: (i) novel robust acoustic features, (ii) new …
Factor analysis based channel mismatch compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to reside in a lower dimensional subspace. This approach does not consider the fact that conventional acoustic feature vectors also reside in a …
This study explores various back-end classifiers for robust speaker recognition in multi-session enrollment, with emphasis on optimal utilization and organization of speaker information present in the development data. Our objective is to construct a highly discriminative back-end framework by fusing several back-ends on an i-vector system framework. It is …
This letter illustrates a novel and effective method for suppressing residual noise from enhanced speech signals as a second-stage post-filtering technique using empirical mode decomposition. The method significantly improves speech listening quality with simultaneous improvement of objective quality indices. The listening test results demonstrate the …
Recent speaker recognition/verification systems generally utilize an utterance dependent fixed dimensional vector as features to Bayesian classifiers. These vectors, known as i-vectors, are lower dimensional representations of Gaussian Mixture Model (GMM) mean super-vectors adapted from a Universal Background Model (UBM) using speech utterance features, and …
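The i-vector model described above can be sketched numerically: a supervector s is modeled as s = m + Tw, where m is the UBM mean supervector, T is a low-rank total variability matrix, and the i-vector is the posterior estimate of w. The toy sketch below uses random data and identity precisions, a large simplification of the full Baum-Welch-statistics estimation; all dimensions and matrices are assumptions for illustration, not the paper's system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions): C mixtures, F-dim features, rank-R subspace.
C, F, R = 8, 13, 4

# UBM mean supervector m (C*F,) and total variability matrix T (C*F, R).
m = rng.normal(size=C * F)
T = rng.normal(size=(C * F, R))

# Utterance supervector generated as s = m + T w plus the projection back:
# with identity precisions the posterior mean of w reduces to a ridge-like
# least-squares solve (the real estimator weights by occupation counts).
s = m + T @ rng.normal(size=R)
w_hat = np.linalg.solve(np.eye(R) + T.T @ T, T.T @ (s - m))
print(w_hat.shape)  # (4,) — the fixed-dimensional i-vector
```

The key property this illustrates is that an utterance of any length collapses to a single fixed-dimensional vector w_hat, which is what makes simple Bayesian back-end classifiers applicable.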
In this paper we study automatic regularization techniques for the fusion of automatic speaker recognition systems. Parameter regularization can dramatically reduce fusion training time; in addition, it removes the need to split the development set into folds for cross-validation. We utilize a majorization-minimization approach to …
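As a minimal sketch of regularized score fusion of the kind discussed above: subsystem scores are combined by an L2-regularized logistic regression, trained here with plain gradient descent on synthetic trials. This is an assumed, simplified stand-in — the paper's majorization-minimization optimizer and real SRE scores are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy verification trials (assumed data): binary target/non-target labels
# and raw scores from two subsystems with different noise levels.
n = 200
y = rng.integers(0, 2, size=n).astype(float)
S = np.column_stack([y + 0.8 * rng.normal(size=n),   # subsystem A scores
                     y + 1.2 * rng.normal(size=n)])  # subsystem B scores
X = np.column_stack([S, np.ones(n)])                 # append bias column

# L2-regularized logistic fusion: minimize cross-entropy + (lam/2)||w||^2.
lam, w = 0.1, np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))          # fused posterior per trial
    grad = X.T @ (p - y) / n + lam * w        # gradient of regularized loss
    w -= 0.5 * grad                           # fixed-step gradient descent

fused = X @ w                                 # fused log-odds scores
print(fused.shape)  # (200,)
```

Because the regularizer fixes the effective model complexity, the fusion weights can be trained on the full development set in one pass, which is the practical benefit the abstract points to.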