Attentive listening system with backchanneling, response generation and flexible turn-taking

Divesh Lala, Pierrick Milhorat, Koji Inoue, Masanari Ishida, Katsuya Takanashi, Tatsuya Kawahara
SIGDIAL Conference

Attentive listening systems are designed to let people, especially senior people, keep talking to maintain communication ability and mental health. This paper addresses key components of an attentive listening system which encourages users to talk smoothly. First, we introduce continuous prediction of end-of-utterances and generation of backchannels, rather than generating backchannels after end-point detection of utterances. This improves subjective evaluations of backchannels. Second, we… 

An Attentive Listening System with Android ERICA: Comparison of Autonomous and WOZ Interactions

An attentive listening system for the autonomous android robot ERICA is described. A gap remains between the system and the WOZ for more sophisticated skills such as dialogue understanding, showing interest, and empathy towards the user.

Construction of Responsive Utterance Corpus for Attentive Listening Response Production

In Japan, the number of single-person households, particularly among the elderly, is increasing. Consequently, opportunities for people to narrate are being reduced. To address this issue,

Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems

The concept of full-duplex in telecommunication is used to demonstrate what a human-like interactive experience should be and how to achieve smooth turn-taking through three subtasks: user state detection, backchannel selection, and barge-in detection.

Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags

This paper tries to build a corpus that covers a wider range of dialogue tasks than existing task-oriented systems or text-chat systems, by transcribing face-to-face dialogues held in natural conversational situations in tasks of information navigation and attentive listening.

A speech-driven embodied entrainment character system with a delayed voice back-channel based on negative emotional expression utterances

The prior research includes the development of a speech-driven embodied entrainment computer-generated character called "InterActor", which automatically generates communicative motions and actions

Generating Fillers Based on Dialog Act Pairs for Smooth Turn-Taking by Humanoid Robot

This study presents a method to generate fillers at the beginning of the system utterances to indicate an intention of turn-taking or turn-holding just like human conversations in spoken dialog systems for humanoid robots.

Response Generation to Out-of-Database Questions for Example-Based Dialogue Systems

A sequence-to-sequence model is proposed that directly generates an appropriate response frame from an input question sentence in an end-to-end manner and explicitly integrates a question type classification to take into account the question type of the out-of-database question.

Neural Generation of Dialogue Response Timings

It is shown that human listeners consider certain response timings to be more natural based on the dialogue context, and the introduction of these models into SDS pipelines could increase the perceived naturalness of interactions.

Analysis of conversational listening skills toward agent-based social skills training

The authors' automated social skills training is extended by considering user listening skills during conversations with computer agents; the number of nods and backchannels within the utterances contributes to the predictions.

It’s About Time: Turn-Entry Timing For Situated Human-Robot Dialogue

This paper introduces a computational framework, based on work from psycholinguistics, that allows a situated dialogue system to start its turn and initiate actions earlier than would otherwise be possible. It is a step toward more natural, human-like turn-taking behavior.

Toward Adaptive Generation of Backchannels for Attentive Listening Agents

By analyzing counseling dialogue, correlation patterns according to the type of backchannels and prosodic features are found; a larger correlation is observed for reactive tokens than for acknowledging tokens, and for power features than for pitch features.

Making Turn-Taking Decisions for an Active Listening Robot for Memory Training

A dialogue system and response model are presented that allow a robot to act as an active listener, encouraging users to tell the robot about their travel memories. This resulted in dialogues with significantly fewer mistakes, a larger proportion of user speech, and fewer interruptions.

Importance-Driven Turn-Bidding for Spoken Dialogue Systems

It is found that Importance-Driven Turn-Bidding performs better than two current turn-taking approaches in an artificial collaborative slot-filling domain and supports the improvement of mixed-initiative interaction.

Building Autonomous Sensitive Artificial Listeners

A fully autonomous integrated real-time system which combines incremental analysis of user behavior, dialogue management, and synthesis of speaker and listener behavior of a SAL character displayed as a virtual agent is described.

Predicting Listener Backchannels: A Probabilistic Multimodal Approach

This paper shows how sequential probabilistic models can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker multimodal output features (e.g., prosody, spoken words and eye gaze).

Conversational system for information navigation based on POMDP with user focus tracking

Using uh and um in spontaneous speaking

Analysis of prosodic and linguistic cues of phrase finals for turn-taking and dialog acts

This paper presents an analysis of the functions carried by phrase-final tones in turn-taking and dialog acts, taking into account linguistic information about the part of speech attributed to the morphemes at phrase finals. No clear relationship is found for some classes of morphemes that are final particles.