Automatic Segmentation for Emotional Feature Extraction from Spoken Sentence


Perceiving a speaker’s emotion is one of the interesting issues in human-robot interaction. In particular, service robots need a friendly and intuitive interface to be useful to people who are not experts at interacting with robots. Among the available modes of communication, speech is the easiest for humans because it is the fundamental tool of human-human interaction. However, features are difficult to extract from continuous speech because speech is a time-variant signal, so the signal must be segmented before it can be analyzed. Researchers interested in phonetic information have usually used 20 to 40 ms windows, because over such a short duration the signal can be assumed to be time-invariant within a frame. Emotion, on the other hand, is hard to reveal in such short durations because it does not change as rapidly as phonetic features do. This paper therefore proposes automatic segmentation for emotional feature extraction. The automatic segmentation estimates the boundaries between phonemes with the “spectral variation function” method and then groups phonemes based on the “average energy per frame”. Simulation results show that automatic segmentation is useful for extracting emotional features from spoken sentences.
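The two-stage idea in the abstract (boundary estimation from spectral change, then grouping by frame energy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes a simplified spectral variation measure (one minus the cosine similarity of mean-removed log spectra of adjacent frames), a local-maximum rule for boundaries, and a fixed energy threshold for grouping. All function names, thresholds, and frame sizes are hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def spectral_variation(frames):
    """Assumed SVF: 1 - cosine similarity of adjacent mean-removed log spectra."""
    win = np.hanning(frames.shape[1])
    spec = np.log(np.abs(np.fft.rfft(frames * win, axis=1)) + 1e-8)
    spec = spec - spec.mean(axis=1, keepdims=True)
    a, b = spec[:-1], spec[1:]
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    return 1.0 - num / den  # entry i compares frame i with frame i + 1

def pick_boundaries(svf, threshold):
    """Place a boundary before frame i + 1 where svf[i] is a local peak."""
    peaks = []
    for i in range(1, len(svf) - 1):
        if svf[i] > threshold and svf[i] >= svf[i - 1] and svf[i] > svf[i + 1]:
            peaks.append(i + 1)
    return peaks

def group_by_energy(frames, boundaries, energy_threshold):
    """Merge adjacent segments whose average energy per frame is high.

    Returns (start_frame, end_frame) pairs, end exclusive. Low-energy
    segments are dropped, so they act as separators between groups.
    """
    energy = np.mean(frames ** 2, axis=1)
    edges = [0] + list(boundaries) + [len(frames)]
    groups = []
    for start, end in zip(edges[:-1], edges[1:]):
        active = energy[start:end].mean() > energy_threshold
        if active and groups and groups[-1][1] == start:
            groups[-1] = (groups[-1][0], end)  # extend the previous group
        elif active:
            groups.append((start, end))
    return groups
```

In use, a signal sampled at 16 kHz might be framed with `frame_signal(x, 400, 160)` (25 ms frames, 10 ms hop), passed through `spectral_variation` and `pick_boundaries`, and the resulting boundaries handed to `group_by_energy`; emotional features would then be computed per group rather than per short frame. The thresholds would need tuning on real speech.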


Cite this paper

@inproceedings{Hyun2008AutomaticSF, title={Automatic Segmentation for Emotional Feature Extraction from Spoken Sentence}, author={Kyung Hak Hyun and Yoon Keun Kwak}, year={2008} }