There has been a large amount of research on speech-driven face animation. In particular, recent research has demonstrated that hidden Markov model techniques can achieve a high level of success in audio/visual mapping without language information. In this paper, first, a linear-model-based facial representation method was…
Edge detection is widely used in medical image processing, automatic diagnosis, and other fields. A novel edge detection algorithm based on a fusion model is proposed. It combines two proposed models: the matrix of the most probable distribution of edge points and the matrix of the difference weight of each point. The most probable…
Edge detection is a long-standing but still challenging problem. Although there are many effective edge detectors, none of them can obtain ideal edges in every situation. To make the results robust for any image, we propose a new edge detection algorithm based on a two-level fusion model that combines several typical edge detectors together with new…
Image-based lip animation synthesis is a promising approach to synthesizing a believable talking head. This paper seeks to show an improvement in the accuracy of mouth prediction from the speech stimulus, and presents the method used to extract the speech features correlated with the speaking mouth. Our lip animation synthesis system is…
In this paper, we propose a type of joint feature with geometric parameters and color moments to represent the speaking-mouth frames for image-based visual speech synthesis systems. Based on FDP around the mouth area, the geometric feature is obtained by computing Euclidean distances to describe the width of the speaking mouth, the height of the outer and…
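The joint feature described above can be illustrated with a minimal sketch. It is not the authors' implementation: the landmark names (`left_corner`, `right_corner`, `outer_top`, `outer_bottom`), the function names, and the choice of the first three color moments per channel (mean, standard deviation, cube root of the third central moment) are all assumptions made for illustration. The abstract is truncated after the outer height, so only the width and outer height are computed here.

```python
import numpy as np


def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))


def geometric_features(landmarks):
    """Mouth width and outer-lip height from hypothetical landmark names."""
    width = euclidean(landmarks["left_corner"], landmarks["right_corner"])
    outer_height = euclidean(landmarks["outer_top"], landmarks["outer_bottom"])
    return np.array([width, outer_height])


def color_moments(region):
    """First three color moments per channel of an H x W x C image region.

    Assumption: mean, standard deviation, and the cube root of the third
    central moment; the source does not say which moments are used.
    """
    arr = np.asarray(region, dtype=float)
    pixels = arr.reshape(-1, arr.shape[-1])
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    skew = np.cbrt(((pixels - mean) ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])


def joint_feature(landmarks, region):
    """Concatenate geometric parameters and color moments into one vector."""
    return np.concatenate([geometric_features(landmarks), color_moments(region)])
```

For a three-channel mouth region, `joint_feature` returns a vector of 2 geometric values followed by 9 color-moment values. In practice the two parts would likely need normalization before being combined, since pixel distances and color statistics live on different scales.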
Visual speech representation for bimodal speech recognition poses particular challenges in modeling the inner lip texture, which reflects different pronunciations through features such as the appearance of the teeth and tongue. This paper proposes and analyzes several possible statistical inner lip texture descriptors to determine an effective and…
To meet the requirement that virtual pedagogy applications be able to evaluate English learners' pronunciation quality, this paper proposes an automatic assessment method based on a bimodal fusion decision algorithm. The pronunciation level is scored by comparing the similarity between the learner's and the standard's audio and video speech signals…