• Corpus ID: 226299824

Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop

Matthew Marge, Carol Y. Espy-Wilson, Nigel G. Ward
With robotics rapidly advancing, more effective human-robot interaction is increasingly needed to realize the full potential of robots for society. While spoken language must be part of the solution, our ability to provide spoken language interaction capabilities is still very limited. The National Science Foundation accordingly convened a workshop, bringing together speech, language, and robotics researchers to discuss what needs to be done. The result is this report, in which we identify key… 

Spoken language interaction with robots: Recommendations for future research
The State of SLIVAR: What's next for robots, human-robot interaction, and (spoken) dialogue systems?
This work synthesizes the reported results and recommendations of recent workshops and seminars that convened to discuss open questions within the important intersection of robotics, human-robot interaction, and spoken dialogue systems research to enable people to more effectively and naturally communicate with robots.
Core Challenges in Embodied Vision-Language Planning
This paper proposes a taxonomy to unify Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. It also presents the core challenges that new EVLP works should seek to address and advocates for task construction that enables model generalizability and furthers real-world deployment.
Individual Interaction Styles: Evidence from a Spoken Chat Corpus
While most individuals exhibited interaction style tendencies, these were generally far from stable, with a predictive model based on individual tendencies outperforming a speaker-independent model by only 3.6%.
Characterizing Spoken Dialog Corpora using Interaction Style Dimensions
The model is presented, its potential utility is illustrated by applying it to subsets of the Switchboard corpus, and some next steps are outlined.
A Dimensional Model of Interaction Style Variation in Spoken Dialog
A dimensional model of the space of interaction styles, derived from a large data set and prosody-based features, is presented. It may be useful for selecting data for dialog model pretraining and fine-tuning, for investigating demographic differences, and for dialog system style adaptation.
Plan Explanations that Exploit a Cognitive Spatial Model
A robot control system whose cognitive world model is based on spatial affordances that generalize over its perceptual data can provide readily understandable natural language about the robot’s intentions and confidence, and generate diverse, contrastive explanations that reference the acquired spatial model.
Two Pragmatic Functions of Breathy Voice in American English Conversation
Although the paralinguistic and phonological significance of breathy voice is well known, its pragmatic roles have been little studied. We report a systematic exploration of the pragmatic functions


From Talking and Listening Robots to Intelligent Communicative Machines
It is a popular view that the future will be inhabited by intelligent talking and listening robots with whom we shall converse using the full palette of linguistic expression available to us as human
Collaborative Effort towards Common Ground in Situated Human-Robot Dialogue
An empirical study is presented that examines the role of the robot’s collaborative effort and the performance of natural language processing modules in dialogue grounding and indicates that in situated human-robot dialogue, a low collaborative effort from the robot may lead its human partner to believe a common ground is established.
Jointly Improving Parsing and Perception for Natural Language Commands through Human-Robot Dialog
Methods are presented for using human-robot dialog to improve language understanding for a mobile robot agent that parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy.
Miscommunication Detection and Recovery in Situated Human–Robot Dialogue
A novel approach to detecting and recovering from miscommunication in dialogue by including situated context, namely, information from a robot’s path planner and surroundings is introduced.
A Research Platform for Multi-Robot Dialogue with Humans
This flexible language and robotic platform takes advantage of existing tools for speech recognition and dialogue management that are compatible with new domains, and implements an inter-agent communication protocol (tactical behavior specification), where verbal instructions are encoded for tasks assigned to the appropriate robot.
Learning to Mediate Perceptual Differences in Situated Human-Robot Dialogue
The empirical evaluation has shown that this weight-learning approach can successfully adjust the weights to reflect the robot's perceptual limitations and can lead to a significant improvement for referential grounding in future dialogues.
Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects
In this paper, we present a dialog system that was exhibited at the Swedish National Museum of Science and Technology. Two visitors at a time could play a collaborative card sorting game together
Is Spoken Language All-or-Nothing? Implications for Future Speech-Based Human-Machine Interaction
It is concluded that interactions between native and non-native speakers, or between adults and children, or even between humans and dogs, might provide critical inspiration for the design of future speech-based human-machine interaction.
Language to Action: Towards Interactive Task Learning with Physical Agents
A brief introduction to interactive task learning where humans can teach physical agents new tasks through natural language communication and action demonstration and highlights the importance of commonsense knowledge, particularly the very basic physical causality knowledge, in grounding language to perception and action.
rrSDS: Towards a Robot-ready Spoken Dialogue System
This paper expands upon the ReTiCo incremental framework by outlining the incremental and multimodal modules and how their computation can be distributed, and demonstrates the power and flexibility of the robot-ready spoken dialogue system to be integrated with almost any robot.