Novel Speech Motion Generation by Modeling Dynamics of Human Speech Production

Kurima Sakai, Takashi Minato, Carlos Toshinori Ishi, Hiroshi Ishiguro
Frontiers Robotics AI

We have developed a method that automatically generates humanlike trunk motions (i.e., the neck and waist motions involved in speech) for a conversational android from its speech in real time. To generate humanlike movements, a mechanical limitation of the android (i.e., its limited number of joints) needs to be compensated for in order to express emotional states through motion. By enforcing the synchronization of speech and motion in the android, the method enables us to compensate for… 

Online processing for speech-driven gesture motion generation in android robots

An online processing method for speech-driven gesture motion generation in an android robot dialogue system is proposed and implemented; the results indicate that gestures should not be delayed by more than 400 ms relative to the speech utterances.

Modeling and evaluating beat gestures for social robots

A system is presented that generates natural talking gestures, learns those gestures from observation alone, and improves on one developed with a simpler motion capture system.

A Speech-Driven Hand Gesture Generation Method and Evaluation in Android Robots

This work analyzed multimodal human–human dialogue data and proposed a speech-driven gesture generation method that takes text, prosody, and dialogue act information into account; the generated gestures were judged to be relatively natural even under the robot hardware constraints.

Similarity of the Impact of Humanoid and In-Person Communications on Frontal Brain Activity of Older People

The results imply that communicating through a humanoid robot induces a pattern of brain activity in older people that is potentially similar to in-person communication.

Older People Prefrontal Cortex Activation Estimates Their Perceived Difficulty of a Humanoid-Mediated Conversation

A model is used that estimates older people's perceived difficulty by mapping their prefrontal cortex (PFC) activity during verbal communication onto fine-grained cluster spaces of a working memory (WM) task, which induces loads on the human PFC through modulation of its difficulty level.

Information Content of Prefrontal Cortex Activity Quantifies the Difficulty of Narrated Stories

This article introduces a novel approach to estimating individuals’ perceived difficulty of stories using the quantified information content of their prefrontal cortex activity, and demonstrates the robustness of this approach by showing its comparable performance in face-to-face, humanoid, speaker, and video-chat settings.

Development of an Autonomous Android that can Naturally Talk with People

  • T. Minato
  • Psychology
  • 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA)
  • 2018
The goal of this study is to develop an android robot that can talk with people in a humanlike manner and to study human-android interaction in both verbal and non-verbal aspects.

Differential Effect of the Physical Embodiment on the Prefrontal Cortex Activity as Quantified by Its Entropy

These findings argue for the significance of embodiment in naturalistic scenarios of social interaction, such as storytelling and verbal comprehension, and the potential application of brain information as a promising sensory gateway in the characterization of behavioural responses in human-robot interaction.

Decoding the Perceived Difficulty of Communicated Contents by Older People: Toward Conversational Robot-Assistive Elderly Care

A semi-supervised learning model based on mapping of the older people's prefrontal cortex activity during their verbal communication onto fine-grained cluster spaces of a working memory (WM) task allows for differential quantification of the observed changes in pattern of PFC activation during verbal communication with respect to the difficulty level of the WM task.

Alignment of the attitude of teleoperators with that of a semi-autonomous android

Studies on social robots that can communicate with humans are increasingly important. In particular, semi-autonomous robots have shown potential for practical applications in which robot autonomy



Natural head motion synthesis driven by acoustic prosodic features

This paper presents a novel data‐driven approach to synthesize appropriate head motion by sampling from trained hidden Markov models (HMMs) and shows that synthesized head motions follow the temporal dynamic behavior of real human subjects.

Speech-driven lip motion generation for tele-operated humanoid robots

In order to tele-operate the lip motion of a humanoid robot from the utterances of the operator, a speech-driven lip motion generation method is developed based on the rotation of the vowel space, given by the first and second formants, around the center vowel, and a mapping to the lip opening degrees.

Evaluation of formant-based lip motion generation in tele-operated humanoid robots

An improved version of the speech-driven lip motion generation method is evaluated, in which lip height and width degrees are estimated from vowel formant information; the results indicate that the proposed audio-based method can generate lip motion with naturalness superior to vision-based and motion capture-based approaches.

Linking facial animation, head motion and speech acoustics

This paper focuses on the development of a system that takes speech acoustics as input, and gives as output the coefficients necessary to animate natural face and head motion.

Live Speech Driven Head-and-Eye Motion Generators

A fully automated framework generates realistic head motion, eye gaze, and eyelid motion simultaneously from live (or recorded) speech input, and can significantly outperform state-of-the-art head and eye motion generation algorithms.

Applying an analysis of acted vocal emotions to improve the simulation of synthetic speech

Analysis of Head Gesture and Prosody Patterns for Prosody-Driven Head-Gesture Animation

Objective and subjective evaluations indicate that the proposed synthesis-by-analysis scheme provides natural-looking head gestures for the speaker with any input test speech, as well as in "prosody transplant" and "gesture transplant" scenarios.

Corpus-based generation of head and eyebrow motion for an embodied conversational agent

This work presents a system that uses corpus-based selection strategies to specify the head and eyebrow motion of an animated talking head, and presents two different methods of selecting motions for the talking head based on the corpus data.

Online speech-driven head motion generating system and evaluation on a tele-operated robot

A tele-operated robot system is presented in which the head motions of the robot are controlled by combining those of the operator with ones automatically generated from the operator's voice, based on dialogue act functions estimated from linguistic and prosodic information extracted from the speech signal.

A gesture-centric Android system for multi-party human-robot interaction

A system is developed that can adjust gestures and facial expressions based on a speaker's location or situation in multi-party communication; it gave humans a more sophisticated impression of the Actroid.