Audiovisual speech recognition using multiscale nonlinear image decomposition

@inproceedings{Matthews1996AudiovisualSR,
  title={Audiovisual speech recognition using multiscale nonlinear image decomposition},
  author={Iain A. Matthews and J. Andrew Bangham and Stephen J. Cox},
  booktitle={ICSLP},
  year={1996}
}
There has recently been increasing interest in the idea of enhancing speech recognition by the use of visual information derived from the face of the talker. This paper demonstrates the use of nonlinear image decomposition, in the form of a ‘sieve’, applied to the task of visual speech recognition. Information derived from the mouth region is used in visual and audiovisual speech recognition of a database of the letters A-Z for four talkers. A scale histogram is generated directly from the…

Semantic Scholar estimates that this publication has 51 citations based on the available data.
