Sarel van Vuuren

We describe use of Linear Discriminant Analysis (LDA) for data-driven automatic design of RASTA-like filters. The LDA applied to rather long segments of time trajectories of critical-band energies yields FIR filters to be applied to these time trajectories in the feature extraction module. Frequency responses of the first three discriminant vectors are in…
RASTA processing has proven to be a successful technique for channel normalization in automatic speech recognition (ASR). We present two approaches to the design of RASTA-like filters from training data. One consists of finding the solution to a constrained optimization problem on the feature time trajectories while the other uses Linear Discriminant…
This article presents a vision of the near future in which computer interaction is characterized by natural face-to-face conversations with lifelike characters that speak, emote and gesture. These animated agents will converse with people much like people converse effectively with assistants in a variety of focused applications. Despite the research advances…
The mutual information concept is used to study the distribution of speech information in frequency and in time. The main focus is on the information that is relevant for phonetic classification. A large database of hand-labeled fluent speech is used to (a) compute the mutual information (MI) between a phonetic classification variable and one spectral feature…
On the Importance of Components of the Modulation Spectrum for Speaker Verification. Sarel van Vuuren (1), Hynek Hermansky (1, 2). In ICSLP, Sydney, Australia, Nov. 1998. (1) Department of Electrical and Computer Engineering, Oregon Graduate Institute of Science and Technology, PO Box 91000, Portland, OR 97291-1000, USA; (2) International Computer…
We compare speaker recognition performance of Vector Quantization (VQ), Gaussian Mixture Modeling (GMM) and the Arithmetic Harmonic Sphericity measure (AHS) in adverse telephone speech conditions. The aim is to address the question: how do multimodal VQ and GMM typically compare to the simpler unimodal AHS for matched and mismatched training and testing…
In this paper we present an algorithm that makes use of information contained in syllable lattices to significantly reduce the classification error rate of a children’s speech reading tracker. The task is to verify whether each word in a reference string was actually spoken. A syllable graph is generated from the reference word string to represent…
In this paper we describe DISCUSS, a dialogue move taxonomy layered over semantic representations. We designed this scheme to enable development of computational models of tutorial dialogues and to provide an intermediate representation suitable for question and tutorial act generation. As such, DISCUSS captures semantic and pragmatic elements across four…
This article describes My Science Tutor (MyST), an intelligent tutoring system designed to improve science learning by students in 3rd, 4th, and 5th grades (7 to 11 years old) through conversational dialogs with a virtual science tutor. In our study, individual students engage in spoken dialogs with the virtual tutor Marni…