Illusions and Issues In Bimodal Speech Perception


As witnessed by this conference and many other sources of evidence, the study of bimodal speech perception has attained the status of a cottage industry. The addition of just one more modality has made transparent several new phenomena, new theoretical endeavors, and a closer link between research and application. The goal of this paper is to review a series of relevant issues in our search for an understanding of speech perception by ear and eye. The issues include viable explanations of the McGurk effect, the time course of auditory/visual processing, neural processing, the role of dynamic information, the information in visual speech, the fusion of written language and auditory speech, and the issue of generalizing from studies of syllables to words and larger segments.

1. SETTING THE STAGE

It has been well over two decades since the publication of "Hearing lips and seeing voices" by the late Harry McGurk and his colleague John MacDonald [1]. The so-called McGurk effect has attracted widespread attention in many circles of psychological inquiry and cognitive science. The classic McGurk effect involves the situation in which an auditory /ba/ is paired with a visible /ga/ and the perceiver reports hearing /da/. The reverse pairing, an auditory /ga/ and visual /ba/, tends to produce a perceptual judgment of /bga/.

We should not be surprised by the finding that auditory experience is influenced by the visual input. Certainly the McGurk effect was not the first crosstalk between modalities to be observed. We seem to have a little voice, not necessarily our own, in our heads as we read written language. Why do people watching a large screen in a movie theater hear an actor's voice coming from his face, even though the audio speakers are on the side of the screen? (This experience is equally powerful in theaters without stereophonic sound, where indeed the auditory message has no information tying the sound to the actor.)
This so-called visual capture is also exploited by ventriloquists, who, contrary to popular belief, do not throw their voice at the puppet. The visual input changes our auditory experience of the location of the sound we are hearing. This situation represents a clear case of crosstalk between modalities [2].

We should be relieved that the McGurk effect resembles other avenues of experience, such as localizing sound in space. Its similarity to other domains offers the hope of a more general account of sensory fusion and modality-specific experience rather than one unique to speech perception by ear and eye. This result might simply mean that we cannot trust modality-specific experience as a direct index of processing within that modality. Speech information from two modalities provides a situation in which the brain combines both sources of information to create an interpretation that is easily mistaken for an auditory one. We believe we hear the speech perhaps because spoken language is usually heard.

We are attracted to bimodal speech perception as a paradigm for psychological inquiry for several reasons. First, it offers a compelling example of how processing information from one modality (vision) influences our experience in another modality (audition). Second, it provides a unique situation in which multiple modalities appear to be fused or integrated in a natural manner. Third, experimental manipulation of these two sources of information is easily carried out. Finally, the research project has the potential for valuable applications for individuals with hearing loss and for other domains of language learning.

1.1. A Downside to Current Inquiry

Many investigators have been misled by the traditional study of the McGurk effect. First of all, it is not reasonable for an investigator to study an effect. For example, it would be foolish for someone to say, "I study the Ebbinghaus illusion."
One investigates illusions to gain some insights into perceptual processing, not simply for the study of illusions. Similarly, it is important to keep in mind that the study of the McGurk effect should be aimed at understanding how we perceive speech. Focusing on the illusion tends to compromise the type of experimental study that is implemented. Most studies of the McGurk effect use just a few experimental conditions in which the auditory and visual sources of information are made to mismatch. Investigators also sometimes fail to test the

Massaro, D. W. (1998). Illusions and Issues in Bimodal Speech Perception. Proceedings of Auditory Visual Speech Perception ’98 (pp. 21-26). Terrigal-Sydney, Australia, December 1998.
