Rathinavelu Chengalvarayan

We investigate a class of features related to voicing parameters that indicate whether the vocal cords are vibrating. Features describing voicing characteristics of speech signals are integrated with an existing 38-dimensional feature vector consisting of first and second order time derivatives of the frame energy and of the cepstral coefficients with their…
This paper addresses the problem of speech recognition under accent variations in the English language. It has been demonstrated in previous research efforts that the multi-transitional model architecture is one of the solutions for robust speech recognition. In this study, we describe a universal hybrid system that is trained with data from American,…
The study presented in this work is a first effort at real-time speech translation of TED talks, a compendium of public talks with different speakers addressing a variety of topics. We address the goal of achieving a system that balances translation accuracy and latency. In order to improve ASR performance for our diverse data set, adaptation techniques…
In this paper, we extend the maximum likelihood (ML) training algorithm to the minimum classification error (MCE) training algorithm for discriminatively estimating the state-dependent polynomial coefficients in the stochastic trajectory model or the trended hidden Markov model (HMM) originally proposed in [2]. The main motivation of this extension is…
In this study, a new hidden Markov model that integrates generalized dynamic feature parameters into the model structure is developed and evaluated using maximum-likelihood (ML) and minimum-classification-error (MCE) pattern recognition approaches. In addition to the motivation of direct minimization of error rate, the MCE approach automatically…
AT&T has recently opened its extensive portfolio of state-of-the-art Speech Technology to external end-developers as a platform called "The AT&T Speech API". This study discusses a series of practical challenges found in an industrial deployment of speech-to-text services; in particular, we examine different strategies for customizing the speech-to-text…