Ashvin Kannan

This paper describes a general formalism for integrating two or more speech recognition technologies, which could be developed at different research sites using different recognition strategies. In this formalism, one system uses the N-best search strategy to generate a list of candidate sentences; the list is rescored by other systems; and the different…
An algorithm for estimation of the parameters of a multiscale stochastic process based on scale-recursive dynamics on trees is presented. The expectation-maximization algorithm is used to provide maximum likelihood estimates for the general case of a nonhomogeneous tree with no fixed structure for the process dynamics. Experimental results are presented…
Segment models are a generalization of HMMs that can represent feature dynamics and/or correlation in time. In this work we develop the theory of Bayesian and maximum-likelihood adaptation for a segment model characterized by a polynomial mean trajectory. We show how adaptation parameters can be shared and adaptation detail can be controlled at run-time…
This paper summarizes the work of the “Rapid Speech Recognizer Adaptation” team in the workshop held at Johns Hopkins University in the summer of 1998. The project addressed the modeling of dependencies between units of speech with the goal of making more effective use of small amounts of data for speaker adaptation. A variety of methods were investigated…
To adapt the large number of parameters in a speech recognition acoustic model with a small amount of data, some notion of parameter dependence is needed. We present a dependence model to relate parameters in a parsimonious framework using a Gaussian multiscale process defined by the evolution of a linear stochastic dynamical system on a tree. To adapt all…
This paper presents an overview of the Boston University continuous word recognition system, which is based on the Stochastic Segment Model (SSM). The key components of the system described here include: a segment-based acoustic model that uses a family of Gaussian distributions to characterize variable-length segments; a divisive clustering technique for…
This paper presents a mechanism for implementing mixtures at a phone-subsegment (microsegment) level for continuous word recognition based on the Stochastic Segment Model (SSM). We investigate the issues that are involved in trade-offs between trajectory and mixture modeling in segment-based word recognition. Experimental results are reported on DARPA's…