Srinivasan Umesh

In this paper, we show that frequency warping (including VTLN) can be implemented through a linear transformation of conventional MFCCs. Unlike the continuous-domain approach of Pitz and Ney [1], we directly determine the relation between frequency warping and the linear transformation in the discrete domain. The advantage of such an approach is that it can be ...
In this paper, we develop a computationally efficient approach for warp factor estimation in Vocal Tract Length Normalization (VTLN). We have recently shown that warped features can be obtained by a linear transformation of the unwarped features. Using the warp matrices, we show that warp factor estimation can be performed efficiently in an EM framework. ...
We present experimental results that show better speaker normalization using our previously reported frequency warping function, which is derived purely from speech data. In our previous work, we numerically computed the frequency warping function for non-uniform scaling, which is similar to the mel scale, such that spectral envelopes from different speakers ...
In this paper, we present a linear transformation (LT) to obtain warped features from unwarped features during vocal-tract length normalisation (VTLN). This LT between the warped and unwarped features is obtained within the conventional MFCC framework, without any modification to the signal processing steps of the feature extraction stage. ...
We present experimental results that show that the scale-factor relating the formant frequencies of different speakers increases with decreasing values of formant frequency. Based on these results, we experimentally obtain a frequency warping function aimed at separating speaker dependencies from the inherent characterization of the sound. We find that the ...
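
Several of the abstracts above describe VTLN as a linear transformation applied to unwarped cepstral features. The sketch below illustrates that general idea and is not the authors' method: it assumes an orthonormal DCT, a simple linear-interpolation warp over log filterbank channels, and a single diagonal-Gaussian model scored by a plain grid search (rather than the EM procedure the abstract mentions). The names `dct_matrix`, `interp_matrix`, `warp_matrix` and `estimate_warp` are hypothetical, and the log-determinant term is the standard Jacobian correction for a linear feature transform, included here as an assumption.

```python
import numpy as np

def dct_matrix(n_ceps, n_filters):
    """Orthonormal DCT-II basis, shape (n_ceps, n_filters)."""
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    C = np.sqrt(2.0 / n_filters) * np.cos(np.pi * k * (m + 0.5) / n_filters)
    C[0] /= np.sqrt(2.0)
    return C

def interp_matrix(n_filters, alpha):
    """Assumed warp: output channel m reads the input at position alpha * m,
    using linear interpolation and clipping at the last channel."""
    pos = np.clip(alpha * np.arange(n_filters), 0, n_filters - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n_filters - 1)
    w = pos - lo
    T = np.zeros((n_filters, n_filters))
    rows = np.arange(n_filters)
    np.add.at(T, (rows, lo), 1.0 - w)
    np.add.at(T, (rows, hi), w)
    return T

def warp_matrix(n_ceps, n_filters, alpha):
    """Cepstral-domain matrix A such that c_warped = A @ c_unwarped.
    The pseudo-inverse maps truncated cepstra back to log filterbank energies."""
    C = dct_matrix(n_ceps, n_filters)
    return C @ interp_matrix(n_filters, alpha) @ np.linalg.pinv(C)

def estimate_warp(X, mean, var, alphas, n_filters):
    """Grid search: pick the alpha that maximises the diagonal-Gaussian
    log-likelihood of the transformed frames, including the Jacobian term."""
    n_frames, n_ceps = X.shape
    best_alpha, best_ll = None, -np.inf
    for alpha in alphas:
        A = warp_matrix(n_ceps, n_filters, alpha)
        Y = X @ A.T  # transform every frame without re-extracting features
        _, logdet = np.linalg.slogdet(A)
        ll = -0.5 * np.sum((Y - mean) ** 2 / var + np.log(2 * np.pi * var))
        ll += n_frames * logdet
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha
```

With this construction, a warp factor of 1.0 yields the identity matrix, so unwarped features pass through unchanged. The matrices depend only on the warp factor, so they can be precomputed once for the whole grid.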