James D. Edge

Recognition of expressed emotion from speech and facial gestures was investigated in experiments on an audio-visual emotional database. A total of 106 audio and 240 visual features were extracted, and features were then selected with the Plus l-Take Away r algorithm based on the Bhattacharyya distance criterion. In the second step, linear transformation methods …
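A rough sketch of the selection step described above, assuming Gaussian class-conditional distributions for the Bhattacharyya criterion; the class matrices X1/X2, the regularisation constant, and the l/r settings are illustrative placeholders rather than the values used in the paper.

```python
import numpy as np

def bhattacharyya(X1, X2):
    # Bhattacharyya distance between two classes over the selected feature
    # columns, assuming Gaussian class-conditional distributions.
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = np.atleast_2d(np.cov(X1, rowvar=False)) + 1e-6 * np.eye(X1.shape[1])
    S2 = np.atleast_2d(np.cov(X2, rowvar=False)) + 1e-6 * np.eye(X2.shape[1])
    S = 0.5 * (S1 + S2)
    d = m1 - m2
    term1 = 0.125 * d @ np.linalg.solve(S, d)
    term2 = 0.5 * np.log(np.linalg.det(S) /
                         np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return term1 + term2

def plus_l_take_away_r(X1, X2, n_select, l=3, r=2):
    # Plus l-Take Away r search: add the l features that most improve the
    # criterion, then drop the r whose removal degrades it least; repeat.
    selected, remaining = [], list(range(X1.shape[1]))

    def score(idx):
        return bhattacharyya(X1[:, idx], X2[:, idx]) if idx else -np.inf

    while len(selected) < n_select and remaining:
        for _ in range(l):                      # forward step: add l features
            if not remaining:
                break
            best = max(remaining, key=lambda f: score(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        for _ in range(r):                      # backward step: remove r features
            if len(selected) <= 1:
                break
            worst = max(selected,
                        key=lambda f: score([s for s in selected if s != f]))
            selected.remove(worst)
            remaining.append(worst)
    return selected[:n_select]
```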
Motion capture (mocap) data is commonly used to recreate complex human motions in computer graphics. Markers are placed on an actor, and the captured movement of these markers allows us to animate computer-generated characters. Technologies have been introduced which allow this technique to be used not only to retrieve rigid body transformations, but also …
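As a minimal illustration of the rigid body fitting mentioned above (not the paper's pipeline), the sketch below recovers the least-squares rotation and translation mapping a reference marker set onto a captured frame via the standard Kabsch/Procrustes solution; the marker arrays are assumed to be corresponding Nx3 point sets.

```python
import numpy as np

def rigid_transform(markers_ref, markers_cur):
    # Least-squares rigid transform (rotation R, translation t) that maps the
    # reference marker positions onto the current frame (Kabsch algorithm).
    cr, cc = markers_ref.mean(axis=0), markers_cur.mean(axis=0)
    H = (markers_ref - cr).T @ (markers_cur - cc)     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # correct an improper rotation (reflection) if the determinant is negative
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cc - R @ cr
    return R, t
```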
Data-driven approaches to 2D facial animation from video have achieved highly realistic results. In this paper we introduce a process for visual speech synthesis from 3D video capture to reproduce the dynamics of 3D face shape and appearance. Animation from real speech is performed by path optimisation over a graph representation of phonetically segmented …
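Purely as a sketch of what path optimisation over a unit graph can look like (the paper's own costs and graph construction are not reproduced here), the dynamic-programming search below picks one captured unit per target phoneme so that the summed target and concatenation costs are minimal; target_cost and join_cost are hypothetical callables supplied by the caller.

```python
def select_path(targets, candidates, target_cost, join_cost):
    # targets: list of target phonemes; candidates: per-target lists of
    # captured units (hashable ids); returns the cheapest unit sequence.
    n = len(targets)
    cost = [{c: target_cost(targets[0], c) for c in candidates[0]}]
    back = [{}]
    for i in range(1, n):
        cost.append({})
        back.append({})
        for c in candidates[i]:
            tc = target_cost(targets[i], c)
            prev, best = min(
                ((p, cost[i - 1][p] + join_cost(p, c)) for p in candidates[i - 1]),
                key=lambda pc: pc[1])
            cost[i][c] = best + tc
            back[i][c] = prev
    # trace back the cheapest path through the unit graph
    last = min(cost[-1], key=cost[-1].get)
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))
```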
We describe a method for the synthesis of visual speech movements using a hybrid unit selection/model-based approach. Speech lip movements are captured using a 3D stereo face capture system and segmented into phonetic units. A dynamic parameterisation of this data is constructed which maintains the relationship between lip shapes and velocities; within this …
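One plausible reading of a shape/velocity parameterisation (an assumption for illustration, not the paper's exact formulation) is to append finite-difference velocities to each flattened lip-shape frame before a PCA-style reduction, so the resulting parameters couple posture with rate of change.

```python
import numpy as np

def dynamic_parameterisation(frames, n_components=20):
    # frames: (T, V, 3) lip vertex positions per frame (assumed layout).
    shapes = frames.reshape(len(frames), -1)       # (T, 3V) flattened shapes
    vels = np.gradient(shapes, axis=0)             # per-frame velocities
    data = np.hstack([shapes, vels])               # joint shape/velocity vectors
    mean = data.mean(axis=0)
    U, S, Vt = np.linalg.svd(data - mean, full_matrices=False)
    basis = Vt[:n_components]                      # principal modes
    params = (data - mean) @ basis.T               # per-frame parameters
    return params, basis, mean
```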
The animation of facial expression has become a popular area of research in the past ten years, in particular with its application to avatar technology and naturalistic user interfaces. In this paper we describe a method to animate speech from small fragments of motion-captured sentences. A dataset of domain-specific sentences is captured and phonetically …
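A toy version of the fragment library implied above, assuming each captured sentence comes with per-phoneme frame boundaries; the segment format and field names are hypothetical.

```python
from collections import defaultdict

def build_fragment_library(sentences):
    # sentences: iterable of (motion, segments) pairs, where motion is a
    # frame-indexed array and segments is a list of
    # (phoneme, start_frame, end_frame) labels.
    library = defaultdict(list)
    for motion, segments in sentences:
        for phoneme, start, end in segments:
            library[phoneme].append(motion[start:end])
    return library   # phoneme label -> list of captured motion fragments
```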
This sketch concerns the animation of facial movement during speech production. In this work we consider speech gestures as trajectories through a space containing all visible vocal tract postures. Within this visible speech space, visual-phonemes (or visemes) are defined as collections of vocal tract postures which produce similar speech sounds …
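To make the viseme idea concrete, the toy k-means below groups posture vectors that lie close together in the visible speech space into a fixed number of viseme classes; the number of visemes and the plain Euclidean metric are assumptions for illustration only.

```python
import numpy as np

def cluster_visemes(postures, n_visemes=14, n_iter=50, seed=0):
    # postures: (N, D) array of vocal tract posture vectors.
    rng = np.random.default_rng(seed)
    centres = postures[rng.choice(len(postures), n_visemes,
                                  replace=False)].astype(float)
    for _ in range(n_iter):
        # assign each posture to its nearest viseme centre
        d = np.linalg.norm(postures[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centre to the mean of its assigned postures
        for k in range(n_visemes):
            if np.any(labels == k):
                centres[k] = postures[labels == k].mean(axis=0)
    return labels, centres
```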
In this paper we describe a technique for animating visual speech by concatenating small fragments of speech movements. The technique is analogous to the most effective audio synthesis techniques which form utterances by blending small fragments of speech waveforms. Motion and audio data is initially captured to cover the target synthesis domain; this data …
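In the spirit of the waveform analogy, a minimal cross-fade concatenation might look like the sketch below; each fragment is assumed to be a (frames x parameters) array with at least overlap frames, and the linear blend is an illustrative choice rather than the blending used in the paper.

```python
import numpy as np

def blend_fragments(fragments, overlap=5):
    # Concatenate motion fragments with a short linear cross-fade over the
    # overlapping frames, analogous to cross-fading speech waveform units.
    out = fragments[0]
    for frag in fragments[1:]:
        w = np.linspace(0.0, 1.0, overlap)[:, None]           # blend weights
        blended = (1.0 - w) * out[-overlap:] + w * frag[:overlap]
        out = np.vstack([out[:-overlap], blended, frag[overlap:]])
    return out
```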