Modelling personality features by changing prosody in synthetic speech

Jürgen Trouvain, Sarah Schmidt, Marc Schröder, Michael Schmitz, William J. Barry
This study explores how features of brand personalities can be modelled with the prosodic parameters pitch level, pitch range, articulation rate and loudness. Experiments with parametrical diphone synthesis showed that listeners rated the prosodically changed versions better than a baseline version for the dimensions "sincerity", "competence", "sophistication", "excitement" and "ruggedness". The contribution of prosodic features such as lower pitch and an enlarged pitch range are analyzed…


Speech is a complex sign and has several layers that convey information. The extralinguistic layer of speech comprises acoustic cues that give the listener a hint about the speaker’s certain
The Impact of Rhythmic Distortions in Speech on Personality Assessment
Abstract The perennial question as to how perceived otherness in speech projects into listener assessment of one’s personality has been systematically investigated within the field of foreign
Speech Synthesis for the Generation of Artificial Personality
This work examines the impact text content, voice quality and synthesis system have on the perceived personality of two synthetic voices and finds that parametric voices were rated as significantly less neurotic than both the text alone and the unit selection system.
Personality prediction based on intonation stylization
Intonation patterns were extracted by a parametric superpositional stylization approach that allows for pattern description on a parametric as well as on a categorical level to predict speaker personality with respect to the four dimensions acting, extroversion, other-directedness and sensitivity.
Acoustic Correlates of Likable Speakers in the NSC Database
Speech stimuli from scenario-based conversations were analyzed regarding acoustic correlates of likability. Utterances from the pizza ordering scenario of the NSC corpus were selected, and the
The voice of personality: mapping nonverbal vocal behavior into trait attributions
The work shows how prosodic features can be used to predict, with an accuracy up to 75% depending on the trait, the personality assessments performed by human judges on a collection of 640 speech samples, based on a short version of the Big Five Inventory.
Pitch and Intonation Contribution to Speakers' Traits Classification
The article describes the system we submitted for the three sub-challenges of INTERSPEECH 2012 Speaker Trait Challenge for the classification of the five personality traits of OCEAN, likability and
The influence of voice pitch on the evaluation of a social robot receptionist
The results show that the high pitch robot was perceived significantly more attractive in terms of voice, behavior and personality and the increased level of the robot's attractiveness induced significantly better ratings on the overall enjoyment and overall interaction quality.
Automatic Chinese Personality Recognition Based on Prosodic Features
Experimental results show that combining pitch, intensity, formants and speech rate as classification parameters achieves higher classification accuracy than any single prosodic feature.
Likability of human voices: A feature analysis and a neural network regression approach to automatic likability estimation
This work investigates the automatic analysis of voice likability in a continuous label space with neural networks as regressors and discusses the relevance of acoustic features.


The studies reviewed in this paper are somewhat diverse. The one unifying feature in all of them is their purpose of identifying the ways in which non-content aspects of speech elicit personality
Expressing degree of activation in synthetic speech
  • M. Schröder
  • Psychology
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2006
A set of emotional prosody rules was formulated and implemented in a German text-to-speech (TTS) system, and a perception study investigated how well the resulting synthesized prosody fits with emotional states defined through textual situation descriptions.
Effects of Pitch and Speech Rate on Personal Attributions
In three experiments, subjects listened to recordings of male speakers answering two interview questions and rated the speakers on a variety of scales. The recordings had been altered so that the
Effects of Speech Rate on Personality Perception
Using the voices of six subjects, representing various social and educational backgrounds, fifty-four synthetic voices were generated by computer, and it was found that the competence factor was much more sensitive to rate manipulations than was the benevolence factor.
Prosodic cues for rated politeness in Japanese speech
The Prosody of Excitement in Horse Race Commentaries
This study investigates examples of horse race commentaries and compares the acoustic properties with an auditorily based description of the typical suspense pattern from calm to very excited at the
Using Prosodic and Voice Quality Features for Paralinguistic Information Extraction
The use of voice quality features in addition to prosodic features is proposed for automatic extraction of paralinguistic information (like speech acts, attitudes and emotions) in dialog speech.
A dimensional approach to vocal expression of emotion
This study explored a dimensional approach to vocal expression of emotion. Actors vocally portrayed emotions (anger, disgust, fear, happiness, sadness) with weak and strong emotion intensity.
Expressing vocal effort in concatenative synthesis
Two hypotheses are verified in perception experiments: (I) the three diphone sets are perceived as belonging to the same speaker; (II) the vocal effort intended during database recordings is perceived in the synthetic voice.
Prosodic Structure Affects the Production and Perception of Voice-Assimilated German Fricatives
Prosodic structure has long been known to constrain phonological processes [1]. More recently, it has also been recognized as a source of fine-grained phonetic variation of speech sounds. In