Creating Simplified Version of Lip Database based on Front View of Face

Ritesh A. Magre and Ajit S. Ghodke, "Creating Simplified Version of Lip Database based on Front View of Face," International Journal of Computer Applications.
Recently a great deal of work has been done on audio-visual speech recognition, but less on visual speech and speaker recognition. This research belongs to the human-computer interaction (HCI) domain; HCI aims to make interaction between humans and computers simple. This paper presents the creation of a database of visual speech and speakers in English, and its preprocessing to improve recognition accuracy. We have studied the Tulips1 database, the AV Database and the CUAVE Database; on the basis of these…


Design and Recording of Czech Audio-Visual Database with Impaired Conditions for Continuous Speech Recognition
The database introduced in this paper can be used for testing visual parameterization in audio-visual speech recognition (AVSR).
Audio-visual speech recognition for difficult environments
Overall, the addition of visual features is shown to improve upon audio-only performance in noisy and multispeaker environments, and techniques are presented that yield improved speech-reading performance for moving talkers.
Improving connected letter recognition by lipreading
The authors show how recognition performance in automated speech perception can be significantly improved by additional lipreading, so-called speech-reading. They show this on an extension of a…
Features for Audio-Visual Speech Recognition
Five new lipreading techniques are evaluated on a hidden-Markov-model-based visual-only recognition task and compared with an enhanced implementation of a previous lip-contour tracker. The addition of visual information to automatic speech recognition improves accuracy, most markedly in acoustically noisy conditions.
Lipreading Using Shape, Shading and Scale
A nontracked alternative is a nonlinear transform of the image using multiscale spatial analysis (MSA), which performs almost identically to AAMs in both visual and audio-visual recognition tasks on a multi-talker database of isolated letters.
Real-time lip tracking and bimodal continuous speech recognition
The experiments show that the bimodal recognizer compares favorably to the acoustic-only counterpart, and the results indicate that it is advantageous to include first derivatives of the visual features.
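The first derivatives mentioned above are commonly computed as regression-based "delta" features appended to the static visual features. A minimal sketch of that computation follows; the window half-size `width=2` and edge padding are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def delta_features(feats, width=2):
    """Append first-order time derivatives (delta features) to a
    (T, D) feature matrix, a common AVSR front-end step.
    `width` is the regression window half-size (an assumption here)."""
    T, D = feats.shape
    # Repeat the first/last frames so every frame has a full window.
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    # Standard regression formula: sum_n n*(x[t+n] - x[t-n]) / (2*sum_n n^2)
    num = sum(n * (padded[width + n : width + n + T] -
                   padded[width - n : width - n + T])
              for n in range(1, width + 1))
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = num / denom
    # Stack static and dynamic features side by side: (T, 2*D).
    return np.hstack([feats, deltas])
```

For a constant signal the deltas are zero; for a linearly increasing feature the interior deltas equal the per-frame slope.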
Visual Speech Recognition with Stochastic Networks
The results indicate that simple hidden Markov models may be used to successfully recognize relatively unprocessed image sequences, and the system achieved performance levels equivalent to untrained humans when asked to recognize the first four English digits.
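At the core of such HMM-based recognition is Viterbi decoding: finding the most likely hidden-state path for a sequence of frame-level emission scores. A generic sketch, not the specific models of the cited work:

```python
import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Viterbi decoding for an HMM.
    log_A  : (N, N) log transition probabilities
    log_B  : (T, N) per-frame log emission scores
    log_pi : (N,)   log initial state probabilities
    Returns the most likely state path as a list of T indices."""
    T, N = log_B.shape
    delta = log_pi + log_B[0]          # best score ending in each state
    back = np.zeros((T, N), dtype=int) # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_A   # (from_state, to_state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]
    # Trace the best path backwards from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With emissions that favor state 0 for the first half of an utterance and state 1 for the second, the decoder recovers the expected two-segment path.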
On the Integration of Auditory and Visual Parameters in an HMM-based ASR
A model that can improve the performance of an audio-visual speech recognizer in an isolated-word, speaker-dependent setting is proposed, using a hybrid system based on two HMMs trained respectively with acoustic and optic data.
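A common way to combine two separately trained stream models, as in the hybrid system above, is late (decision-level) fusion: weight the per-word log-likelihoods of the acoustic and visual HMMs and pick the best fused score. A minimal sketch; the stream weight `lam` and the word scores are illustrative assumptions.

```python
def fuse_scores(audio_loglik, visual_loglik, lam=0.7):
    """Weighted late fusion of per-word log-likelihoods from an
    acoustic model and a visual model. lam is the acoustic stream
    weight (an assumed, tunable value)."""
    return {w: lam * audio_loglik[w] + (1.0 - lam) * visual_loglik[w]
            for w in audio_loglik}

def recognize(audio_loglik, visual_loglik, lam=0.7):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_loglik, visual_loglik, lam)
    return max(fused, key=fused.get)
```

Varying `lam` shifts trust between the streams, which is why such systems often retune the weight as acoustic noise increases.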
Robust speech recognition and feature extraction using HMM2
It is shown that HMM2 can be used to extract noise-robust features, presumed to be related to formant regions, which can be used as extra features for traditional HMM recognizers to improve their performance.
Design issues for a digital audio-visual integrated database
The design issues for the DAVID audio-visual database are presented and a number of existing audio-visual databases are summarised.