John W. McDonough

In this work we formulate a novel approach to estimating the parameters of continuous-density HMMs for speaker-independent (SI) continuous speech recognition. It is motivated by the fact that variability in SI acoustic models is attributed to both phonetic variation and variation among the speakers of the training population, that is independent of the ...
In the early 1990s, the availability of the TIMIT read-speech phonetically transcribed corpus led to work at AT&T on the automatic inference of pronunciation variation. This work, briefly summarized here, used stochastic decision trees trained on phonetic and linguistic features, and was applied to the DARPA North American Business News read-speech ASR task.
This paper describes the speaker-adaptive training (SAT) approach for speaker-independent (SI) speech recognizers as a method for joint speaker normalization and estimation of the parameters of the SI acoustic models. In SAT, speaker characteristics are modeled explicitly as linear transformations of the SI acoustic parameters. The effect of inter-speaker ...
In recent work, we proposed the rational all-pass transform (RAPT) as the basis of a speaker adaptation scheme intended for use with a large-vocabulary speech recognition system. It was shown that RAPT-based adaptation reduces to a linear transformation of cepstral means, much like the better-known maximum likelihood linear regression (MLLR). In a set of ...
An 835 base pair (bp) fragment of mitochondrial DNA (mtDNA) was sequenced to characterize genetic variation within and among 1,053 samples comprising five regional populations each of longtail macaques (Macaca fascicularis) and rhesus macaques (Macaca mulatta), and one sample each of Japanese (M. fuscata) and Taiwanese (M. cyclopis) macaques. The mtDNA ...
DNA was extracted from the buffy coats or serum of 212 rhesus macaques (Macaca mulatta) sampled throughout the species' geographic range. An 835 base pair (bp) fragment of mitochondrial DNA (mtDNA) was amplified from each sample, sequenced, aligned, and used to estimate genetic distances from which phylogenetic trees were constructed. A tree that included ...
The PASCAL Speech Separation Challenge (SSC) is based on a corpus of sentences from the Wall Street Journal task read by two speakers simultaneously and captured with two circular eight-channel microphone arrays. This work describes our system for the recognition of such simultaneous speech. Our system has four principal components: A person tracker returns ...
In this work, we propose an algorithm for acoustic source localization based on time delay of arrival (TDOA) estimation. In earlier work by other authors, an initial closed-form approximation was first used to estimate the true position of the speaker, followed by a Kalman filtering stage to smooth the time series of estimates. In the proposed algorithm, ...
Accurately modelling pronunciation variability in conversational speech is an important component of an automatic speech recognition system. We describe some of the projects undertaken in this direction during and after WS97, the Fifth LVCSR Summer Workshop, held at Johns Hopkins University, Baltimore, in July-August 1997. We first illustrate a use of ...