Carol Y. Espy-Wilson

We present a method to boost the performance of probabilistic generative models that work with i-vector representations. The proposed approach deals with the non-Gaussian behavior of i-vectors by performing a simple length normalization. This non-linear transformation allows the use of probabilistic models with Gaussian assumptions that yield equivalent ...
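As a rough illustration of the length normalization mentioned above, the sketch below scales each i-vector to unit Euclidean norm. It assumes i-vectors are stored as rows of a NumPy array; the function name `length_normalize` and the `eps` guard are hypothetical, and any preprocessing the full method may apply before normalization is not covered by the excerpt and is left out here.

```python
import numpy as np

def length_normalize(ivectors, eps=1e-12):
    """Scale each row (one i-vector) to unit Euclidean length.

    `eps` is an illustrative guard against division by zero for
    all-zero vectors; it is an assumption, not part of the source.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    norms = np.linalg.norm(ivectors, axis=-1, keepdims=True)
    return ivectors / np.maximum(norms, eps)

# Synthetic stand-ins for i-vectors (random data, for illustration only).
rng = np.random.default_rng(0)
raw = rng.standard_normal((4, 400)) * 3.0
normalized = length_normalize(raw)

# Every normalized vector now lies on the unit sphere.
print(np.linalg.norm(normalized, axis=1))
```

After this step every vector has norm 1, so only its direction carries information into the downstream Gaussian model.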
In this paper we present a study on the automatic identification of acquisition devices when only the output speech recordings are accessible. A statistical characterization of the frequency response of the device contextualized by the speech content is proposed. In particular, the intrinsic characteristics of the device are captured by a template, ...
We present a multicondition training strategy for Gaussian Probabilistic Linear Discriminant Analysis (PLDA) modeling of i-vector representations of speech utterances. The proposed approach uses a multicondition set to train a collection of individual subsystems that are tuned to specific conditions. A final verification score is obtained by combining the ...
Many different studies have claimed that articulatory information can be used to improve the performance of automatic speech recognition systems. Unfortunately, such articulatory information is not readily available in typical speaker-listener situations. Consequently, such information has to be estimated from the acoustic signal in a process which is ...
We propose a method that combines acoustic-phonetic knowledge with support vector machines for segmentation of continuous speech into five classes: vowel, sonorant consonant, fricative, stop, and silence. We show that by using a probabilistic phonetic feature hierarchy, only four classifiers are required to recognize the five classes. Due to the probabilistic ...
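To make the "four classifiers for five classes" idea concrete, here is a minimal sketch of how four binary decisions arranged in a tree can yield five class probabilities. The specific tree (speech vs. silence, then sonorant, then syllabic or continuant) and all names are assumptions chosen for illustration; the excerpt does not specify the hierarchy, and the inputs stand in for probabilistic outputs of the four classifiers.

```python
def class_posteriors(p_speech, p_sonorant, p_syllabic, p_continuant):
    """Combine four binary probabilities into five class probabilities.

    Assumed (hypothetical) hierarchy:
      speech?  no  -> silence
               yes -> sonorant? yes -> syllabic?   yes -> vowel
                                                   no  -> sonorant consonant
                                no  -> continuant? yes -> fricative
                                                   no  -> stop
    Each class probability is the product of the branch probabilities
    along its path from the root.
    """
    return {
        "silence": 1.0 - p_speech,
        "vowel": p_speech * p_sonorant * p_syllabic,
        "sonorant consonant": p_speech * p_sonorant * (1.0 - p_syllabic),
        "fricative": p_speech * (1.0 - p_sonorant) * p_continuant,
        "stop": p_speech * (1.0 - p_sonorant) * (1.0 - p_continuant),
    }

# Made-up classifier outputs for a single frame (illustration only).
posteriors = class_posteriors(
    p_speech=0.9, p_sonorant=0.8, p_syllabic=0.7, p_continuant=0.5
)
best = max(posteriors, key=posteriors.get)
print(best, posteriors)
```

Because each internal node splits its probability mass between two branches, the five leaf probabilities always sum to one, which is why four binary classifiers suffice.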
Mel-frequency cepstral coefficients (MFCC) have been the dominant features in speaker recognition as well as in speech recognition. However, based on theories in speech production, some speaker characteristics associated with the structure of the vocal tract, particularly the vocal tract length, are reflected more in the high frequency range of speech. This ...
A probabilistic framework for a landmark-based approach to speech recognition is presented for obtaining multiple landmark sequences in continuous speech. The landmark detection module uses as input acoustic parameters (APs) that capture the acoustic correlates of some of the manner-based phonetic features. The landmarks include stop bursts, vowel onsets, ...
The aim of this work was to propose Acoustic Parameters (APs) for the automatic detection of vowel nasalization based on prior knowledge of the acoustics of nasalized vowels. Nine automatically extractable APs were proposed to capture the most important acoustic correlates of vowel nasalization (extra pole-zero pairs, F1 amplitude reduction, F1 bandwidth ...