Mattias Heldner

This paper explores durational aspects of pauses, gaps and overlaps in three different conversational corpora with a view to challenging claims about precision timing in turn-taking. Distributions of pause, gap and overlap durations in conversations are presented, and methodological issues regarding the statistical treatment of such distributions are…
The evaluation and development of spoken dialogue systems is a complex undertaking, and much effort is expended on making it manageable. Research and industry endeavours in the area often seek to compare versions of existing systems, or to compare component technologies, in order to find the best methods – where “best” is defined as most efficient.
This paper describes a recently introduced vector-valued representation of fundamental frequency variation – the FFV spectrum – which has a number of desirable properties. In particular, it is instantaneous, continuous, distributed, and well-suited to application of standard acoustic modeling techniques. We show what the representation looks like, and how…
Detection thresholds for gaps and overlaps, that is, acoustic and perceived silences and stretches of overlapping speech in speaker changes, were determined. Subliminal gaps and overlaps were categorized as no-gap-no-overlaps. The established gap and overlap detection thresholds both corresponded to the duration of a long vowel, or about 120 ms. These…
This paper investigates prosodic aspects of turn-taking in conversation with a view to improving the efficiency of identifying relevant places at which a machine can legitimately begin to talk to a human interlocutor. It examines the relationship between interaction control, the communicative function of which is to regulate the flow of information between…
State-of-the-art speech recognizers are trained on predominantly normal speech and have difficulties handling either exceedingly slow and hyperarticulated or fast and sloppy speech. Explicitly instructing users on how to speak, however, can make the human–computer interaction stilted and unnatural. If it is possible to affect users’ speaking rate while…
Dynamic modeling of spoken dialogue seeks to capture how interlocutors change their speech over the course of a conversation. Much work has focused on how speakers adapt or entrain to different aspects of one another’s speaking style. In this paper we focus on local aspects of this adaptation. We investigate the relationship between backchannels and the…