Lutz Welling

In this paper we investigate Linear Discriminant Analysis (LDA) for the TI connected digit recognition task (TI task) and the Wall Street Journal large vocabulary recognition task (WSJ task). In addition to previous variants of LDA implementations, we avoided the explicit incorporation of derivatives in the acoustic vector. Instead, a sliding window without …
This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and …
In this paper, we present an overview of the RWTH Aachen large vocabulary continuous speech recognizer. The recognizer is based on continuous density hidden Markov models and a time-synchronous left-to-right beam search strategy. Experimental results on the ARPA Wall Street Journal (WSJ) corpus verify the effects of several system components, namely …
In this paper we describe the optimization of 'conventional' template matching techniques for connected digit recognition (TI/NIST connected digit corpus). In particular, we carried out a series of experiments in which we studied various aspects of signal processing, acoustic modeling, mixture densities and linear transforms of the acoustic vector. After …
In this work we compare two parameter optimization techniques for discriminative training using the MMI criterion: the extended Baum-Welch (EBW) algorithm and the generalized probabilistic descent (GPD) method. Using Gaussian emission densities, we found special expressions for the step sizes in GPD, leading to reestimation formulae very similar to those …
In this paper we describe experiments with the acoustic front-end of our large vocabulary speech recognition system. In particular, two aspects are studied: 1) linear transforms for feature extraction and 2) the modelling of the emission probabilities. Experiments are reported on a 5000-word task of the ARPA Wall Street Journal database. For the linear …
This paper gives an overview of an architecture and search organization for large vocabulary, continuous speech recognition (LVCSR at RWTH). In the first part of the paper, we describe the principle and architecture of an LVCSR system. In particular, the issues of modeling and search for phoneme-based recognition are discussed. In the second part, we review …