Alexander Schmitt

In this work we describe the modeling and prediction of Interaction Quality (IQ) in Spoken Dialogue Systems (SDS) using Support Vector Machines. The model can be employed to estimate the quality of the ongoing interaction at arbitrary points in a spoken human-computer interaction. We show that the use of 52 completely automatic features characterizing the …
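The abstract above describes SVM-based prediction of per-exchange IQ scores. A minimal sketch of that setup with scikit-learn follows; the toy data, the random features standing in for the 52 automatic parameters, and the 1-to-5 label scale are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: estimating Interaction Quality (IQ) per dialogue exchange
# with an SVM. Feature values and labels below are synthetic stand-ins.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy stand-in for 52 automatic features per exchange (e.g. ASR confidence,
# barge-in flags, dialogue-act counts) -- all hypothetical here.
X_train = rng.normal(size=(200, 52))
# Assumed IQ labels on a 1 (very poor) to 5 (very good) scale.
y_train = rng.integers(1, 6, size=200)

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)

# Estimate IQ at an arbitrary point in an ongoing interaction:
X_new = rng.normal(size=(3, 52))
pred = clf.predict(X_new)  # one IQ estimate per exchange
print(pred)
```

Because the model conditions only on automatically extractable features, the same call can be made after every exchange without human annotation at runtime.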
The present paper is devoted to the study of vector bundles with an additional structure from a unified point of view. We have picked the name "decorated vector bundles" suggested in [19]. Before we outline our paper, let us give some background. The first problem to treat is the problem of classifying vector bundles over an algebraic curve X, assumed …
The present study elaborates on the exploitation of both linguistic and acoustic feature modeling for anger classification. In terms of acoustic modeling we generate statistics from acoustic audio descriptors, e.g. pitch, loudness, spectral characteristics. Ranking our features we see that loudness and MFCC seem most promising for all databases. For the …
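The abstract above mentions generating statistics from frame-level acoustic descriptors. A small sketch of that step, assuming synthetic descriptor frames and a hypothetical choice of statistics (mean, standard deviation, min, max), could look like this:

```python
# Hedged sketch: turning frame-level acoustic descriptors (pitch, loudness,
# MFCC coefficients) into a fixed-length statistics vector ("functionals")
# suitable for a classifier. Descriptor values here are synthetic.
import numpy as np

def functionals(frames: np.ndarray) -> np.ndarray:
    """Per-descriptor statistics over the time axis (frames x descriptors)."""
    return np.concatenate([
        frames.mean(axis=0),
        frames.std(axis=0),
        frames.min(axis=0),
        frames.max(axis=0),
    ])

rng = np.random.default_rng(1)
# e.g. 100 frames of [pitch, loudness] plus 13 MFCC coefficients = 15 descriptors
frames = rng.normal(size=(100, 15))
feats = functionals(frames)
print(feats.shape)  # 15 descriptors x 4 statistics
```

Collapsing variable-length utterances into fixed-length vectors this way is what lets standard classifiers rank descriptors such as loudness and MFCCs against each other.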
Standardized corpora are the foundation for spoken language research. In this work, we introduce an annotated and standardized corpus in the Spoken Dialog Systems (SDS) domain. Data from the Let's Go Bus Information System from Carnegie Mellon University in Pittsburgh has been formatted, parameterized and annotated with quality, emotion, and task …
The online prediction of task success in Interactive Voice Response (IVR) systems is a comparatively new field of research. It helps to identify problematic calls and enables the dialog system to react before the caller gets overly frustrated. This publication investigates to what extent it is possible to predict task completion and how existing …
Acoustic anger detection in voice portals can help to enhance human-computer interaction. In this paper we report on the performance of selected acoustic features for anger classification. We evaluate the performance of the features on both a German and an American English dialogue voice portal database which contain "real" speech, i.e. non-acted …
Most studies on speech-based emotion recognition are based on prosodic and acoustic features, employing only artificial acted corpora whose results cannot be generalized to telephone-based speech applications. In contrast, we present an approach based on utterances from 1,911 calls from a deployed telephone-based speech application, taking advantage of …