Jayadev Billa

This paper describes the 1996 Byblos Callhome speech recognition system for Spanish and Egyptian Colloquial Arabic. The system uses a combination of Phonetically Tied-Mixture Gaussian HMMs and State-Clustered Tied-Mixture Gaussian HMMs in a multiple-pass decoder. We focus here on the aspects of the system which are language specific and demonstrate the…
This paper presents the 1997 BBN Byblos Large Vocabulary Speech Recognition (LVCSR) system. We give an outline of the algorithms and procedures used to train the system, describe the recognizer configuration, and present the major technological innovations that led to performance improvements. The major testbed for which we present our results is the Switchboard…
Computational models of the peripheral auditory system have largely modeled basilar membrane (BM) mechanics as a linear filter-bank-like entity. Recent mathematical work on the nature of auditory system noise suppression allows us to analyze and argue for the incorporation of BM nonlinearities into these models. This analysis shows that vowel perception…
This paper explores techniques for utilizing untranscribed training data pools to increase the available training data for automatic speech recognition systems. It has been well established that current speech recognition technology, especially in Large Vocabulary Conversational Speech Recognition (LVCSR), is largely language independent, and that the…
This paper describes the improvements that resulted in the 1998 Byblos Large Vocabulary Conversational Speech Recognition (LVCSR) System. Salient among these improvements are: improved signal processing, improved Hidden Markov Model (HMM) topology, use of quinphone context, introduction of diagonal speaker adapted training (DSAT), incorporation of variance…
In this paper we propose a new approach to the modeling of speech based on cues from the peripheral auditory system. Our approach attempts to incorporate the dynamic adaptation of biological auditory systems to varying sound by formulating a simple dual-processing strategy that treats unvoiced and voiced speech as deserving of different processing.
