People in dialog use a rich set of nonverbal behaviors, including variations in the prosody of their utterances. Such behaviors, often emotion-related, call for appropriate responses, but today's spoken dialog systems lack the ability to do this. Recent work has shown how to recognize user emotions from prosody and how to express system-side emotions with …
The rules governing turn-taking phenomena are not well understood in general and almost completely undocumented for Arabic. As the first step to modeling these phenomena, we have collected a small corpus of Iraqi Arabic spoken dialogs. The corpus is in three parts. Part A is 110 minutes of unstructured conversations. Parts B1 and B2 are 176 minutes of …
In spoken dialog, speakers are simultaneously engaged in various mental processes, and it seems likely that the word that will be said next depends, to some extent, on the states of these mental processes. Further, these states can be inferred, to some extent, from properties of the speaker's voice as they change from moment to moment. As an illustration of …
This technical report complements the main paper, Patterns of Importance Variation in Spoken Dialog [Ward and Richart-Ruiz, 2013], by providing additional evidence for the claims, additional findings, and more analysis. In particular, we report more on inter-annotator disagreement, on words that correlate with importance, on prosodic features and patterns …
Models of dialog state are important, both scientifically and practically, but today's best build strongly on tradition. This paper presents a new way to identify the important dimensions of dialog state, more bottom-up and empirical than previous approaches. Specifically, we applied Principal Component Analysis to a large number of low-level prosodic …
Today there are solutions for some specific turn-taking problems, but no general model. We show how turn-taking can be reduced to two more general problems, prediction and selection. We also discuss the value of predicting not only future speech/silence but also prosodic features, thereby handling not only turn-taking but "turn-shaping". To illustrate how …
Discovering and quantifying the prosodic signals that help manage turn-taking is difficult, in part because of the limitations of commonly used methods. This paper presents an integrated method that uses both perceptually-based analysis and quantitative analysis. The eight activities involved in the method: clarification of aims, problem formulation, …
If we can model the cognitive and communicative processes underlying speech, we should be able to better predict what a speaker will do. With this idea as inspiration, we examine a number of prosodic and timing features as potential sources of information on what words the speaker is likely to say next. In spontaneous dialog we find that word probabilities …
Inspired by the goal of modeling the dialog state and the speaker's mental state, moment by moment, we apply Principal Component Analysis to a vector of 76 prosodic features spanning 6 seconds of context. This gives a multidimensional representation of the current state. We find that word probabilities vary strongly with several of these dimensions, that …
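As a rough illustration of the Principal Component Analysis step mentioned in the abstract above, the following minimal sketch projects per-moment feature vectors onto their leading principal components. It is not the authors' implementation: the function name `pca_project`, the choice of four components, the per-feature standardization, and the random stand-in data (500 moments by 76 features) are all assumptions made for illustration, and real input would be prosodic features extracted from dialog audio.

```python
import numpy as np


def pca_project(features, n_components):
    """Project rows of `features` onto the top principal components.

    features: array of shape (n_moments, n_features), one row per time point.
    Returns (scores, components, explained_ratio).
    """
    # Standardize each feature so that no single feature dominates (an assumption).
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    z = (features - mean) / std

    # The rows of vt are the principal directions, ordered by variance explained.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    components = vt[:n_components]

    # Coordinates of each moment in the reduced space.
    scores = z @ components.T

    # Fraction of total variance captured by each retained component.
    explained_ratio = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, components, explained_ratio


# Hypothetical stand-in data: 500 moments, 76 features each (random, not real prosody).
rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 76))
scores, components, explained_ratio = pca_project(frames, n_components=4)
```

Each row of `scores` is then a low-dimensional description of one moment, which is the kind of representation that could be related to word probabilities in a downstream model.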