Sphinx-4 is a flexible, modular and pluggable framework to help foster new innovations in the core research of hidden Markov model (HMM) recognition systems. The design of Sphinx-4 is based on patterns that have emerged from the design of past systems as well as new requirements based on areas that researchers currently want to explore. To exercise this…
We present an algorithm for dereverberation of speech signals for automatic speech recognition (ASR) applications. Often ASR systems are presented with speech that has been recorded in environments that include noise and reverberation. The performance of ASR systems degrades with increasing levels of noise and reverberation. While many algorithms have been…
This paper investigates the use of higher-order autoregressive vector predictors for tracking the noise in noisy speech signals. The autoregressive predictors form the state equation of a linear dynamical system that models the spectral dynamics of the noise process. Experiments show that the use of such models to track noise can lead to large gains in…
Recent findings from studies of two families have shown that mutations in the GABA(A)-receptor gamma2 subunit are associated with generalized epilepsies and febrile seizures. Here we describe a family that has generalized epilepsy with febrile seizures plus (GEFS(+)), including an individual with severe myoclonic epilepsy of infancy, in whom a third…
The Carnegie Mellon Communicator is a telephone-based dialog system that supports planning in a travel domain. The implementation of such a system requires two complementary components: an architecture capable of managing interaction and the task, and a knowledge base that captures the speech, language and task characteristics specific to the domain…
In this paper we model noise as a sequence of states of a dynamical system with a continuum of states. Observations generated by such a system are assumed to be related to the state of the system by a functional relation which models clean speech as the corrupting influence on noise. We show how the closed-form representation of such a dynamical system can…
This paper proposes to use non-negative matrix factorization based speech enhancement in robust automatic recognition of mixtures of speech and music. We represent magnitude spectra of noisy speech signals as the non-negative weighted linear combination of speech and noise spectral basis vectors that are obtained from training corpora of speech and music…
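The representation described in this abstract can be sketched as a matrix factorization. The symbols below are assumptions chosen for illustration and do not come from the paper: V holds the magnitude spectra of the noisy signal (one column per frame), B_s and B_n hold the speech and noise (music) basis vectors learned from training data, and W_s and W_n hold the corresponding non-negative weights.

```latex
% Illustrative notation only; the symbol names are assumed, not the paper's.
\[
\mathbf{V} \;\approx\;
\begin{bmatrix} \mathbf{B}_s & \mathbf{B}_n \end{bmatrix}
\begin{bmatrix} \mathbf{W}_s \\ \mathbf{W}_n \end{bmatrix},
\qquad
\mathbf{B}_s,\ \mathbf{B}_n,\ \mathbf{W}_s,\ \mathbf{W}_n \;\ge\; 0 .
\]
```

Each column of V is thus approximated by a non-negative weighted sum of speech and noise basis vectors, which is the linear combination the abstract refers to.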
Missing-feature methods improve automatic recognition of noisy speech by removing unreliable noise-corrupted spectrographic components from the signal. Recognition is performed either by modifying the recognizer to work from incomplete spectra, or by estimating the missing components to reconstruct complete spectra. While the former approach performs…
In the tandem approach to modeling the acoustic signal, a neural-net preprocessor is first discriminatively trained to estimate posterior probabilities across a phone set. These are then used as feature inputs for a conventional hidden Markov model (HMM) based speech recognizer, which relearns the associations to sub-word units. In this paper, we apply the…
Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of…
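As a minimal sketch of the mapping table this abstract describes, the following Python snippet expands a word sequence into subword units using a toy dictionary. The entries, the ARPAbet-style phone labels, and the names `DICTIONARY` and `words_to_units` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical toy pronunciation dictionary: word -> sequence of subword units.
# The phone labels are ARPAbet-style and illustrative only.
DICTIONARY = {
    "speech": ["S", "P", "IY", "CH"],
    "recognition": ["R", "EH", "K", "AH", "G", "N", "IH", "SH", "AH", "N"],
}


def words_to_units(words, dictionary):
    """Expand a list of words into one flat list of subword units."""
    units = []
    for word in words:
        key = word.lower()  # lookups are case-insensitive in this sketch
        if key not in dictionary:
            # A word missing from the dictionary cannot be expanded.
            raise KeyError(f"word not in dictionary: {word!r}")
        units.extend(dictionary[key])
    return units


if __name__ == "__main__":
    print(words_to_units(["Speech", "recognition"], DICTIONARY))
```

A word that is missing from the table cannot be expanded at all, which is one simple way to see why the choice of dictionary entries matters to the recognizer.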