Taufiq Hasan

Learn More
State-of-the-art Gaussian mixture model (GMM)-based speaker recognition/verification systems utilize a universal background model (UBM), which typically requires extensive resources, especially if multiple channel and microphone categories are considered. In this study, a systematic analysis of speaker verification system performance is considered for which(More)
Factor analysis based channel mismatch compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to reside in a lower dimensional subspace. This approach does not consider the fact that conventional acoustic feature vectors also reside in a(More)
Speaker recognition systems trained on long duration utterances are known to perform significantly worse when short test segments are encountered. To address this mismatch, we analyze the effect of duration variability on phoneme distributions of speech utterances and i-vector length. We demonstrate that, as utterance duration is decreased, number of(More)
In this study we evaluate the effectiveness of our recently introduced Mean Hilbert Envelope Coefficients (MHEC) in i-vector speaker verification using heavy-tailed probabilistic linear discriminant analysis (HT-PLDA) as the compensation/backend framework. The i-vectors are estimated for MHECs, and also the conventional and widely used MFCCs for comparison.(More)
This study explores various back-end classifiers for robust speaker recognition in multi-session enrollment, with emphasis on optimal utilization and organization of speaker information present in the development data. Our objective is to construct a highly discriminative back-end framework by fusing several back-ends on an i-vector system framework. It is(More)
The recently introduced mean Hilbert envelope coefficients (MHEC) have been shown to be an effective alternative to MFCCs for robust speaker identification under noisy and reverberant conditions in relatively small tasks. In this study, we investigate the effectiveness of these acoustic features in the context of a state-of-the-art speaker recognition(More)
Recent speaker recognition/verification systems generally utilize an utterance dependent fixed dimensional vector as features to Bayesian classifiers. These vectors, known as i-Vectors, are lower dimensional representations of Gaussian Mixture Model (GMM) mean super-vectors adapted from a Universal Background Model (UBM) using speech utterance features, and(More)
Robustness due to mismatched train/test conditions is one of the biggest challenges facing speaker recognition today, with transmission channel/handset and additive noise distortion being the most prominent factors. One limitation of the recent speaker recognition systems is that they are based on a latent factor analysis modeling of the GMM mean(More)
Physical task stress affects the acoustic speech wave in various ways. Motivated by observations that fundamental frequency and open quotient are affected by physical task stress, this study examines the effects of physical task stress on a set of glottal features. It is shown that a set of six glottal features can be used for physical stress detection,(More)
Identifying a person by his or her voice is an important human trait most take for granted in natural human-to-human interaction/communication. Speaking to someone over the telephone usually begins by identifying who is speaking and, at least in cases of familiar speakers, a subjective verification by the listener that the identity is correct and the(More)