Models for predicting judgments about the quality of Spoken Dialog Systems have been used as overall evaluation metrics or as optimization functions in adaptive systems. We describe a new approach to such models, using Hidden Markov Models (HMMs). The user's opinion is regarded as a continuous process evolving over time. We present the data collection method…
Proper usability evaluations of spoken dialogue systems are costly and cumbersome to carry out. In this paper, we present a new approach for facilitating usability evaluations which is based on user error simulations. The idea is to replace real users with simulations derived from empirical observations of users' erroneous behavior. The simulated errors…
Emotion plays an important role in human communication, so human-machine dialog systems can also benefit from affective processing. In this paper, we present an overview of our work from the past few years and discuss general considerations, potential applications, and experiments that we conducted on the emotional classification of human machine…
Quality of Service (QoS) and Quality of Experience (QoE) have to be considered when designing, building and maintaining services involving multimodal human–machine interaction. In order to guide the assessment and evaluation of such services, we first develop a taxonomy of the most relevant QoS and QoE aspects which result from multimodal human–machine…
An experiment (N=24) was conducted with a spoken dialogue system (a smart home system), in which the users carried out several tasks with the system and rated its usability. Users' interactions were analyzed from the perspective of human error research done in human factors and cognitive ergonomics, distinguishing between goal-, concept-, task-, and…
This paper investigates the relationship between user ratings of multimodal systems and user ratings of their single modalities. Based on previous research showing precise predictions of ratings of multimodal systems based on ratings of single modalities, it was hypothesized that the accuracy might have been caused by the participants' efforts to rate…