Michael M. Cohen

This paper presents an initial implementation and evaluation of a system that synthesizes visual speech directly from the acoustic waveform. An artificial neural network (ANN) was trained to map the cepstral coefficients of an individual’s natural speech to the control parameters of an animated synthetic talking head. We trained on two data sets; one was a …
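At its core, the mapping described above is a frame-by-frame regression from cepstral features to animation controls. The sketch below illustrates that idea with placeholder data and an off-the-shelf regressor; the feature dimensions, network size, and parameter count are assumptions for illustration, not the paper's actual configuration.

    # Minimal sketch of mapping cepstral coefficients to talking-head
    # control parameters. All dimensions and the network size are assumptions.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # Placeholder data: in practice, X would hold cepstral frames extracted
    # from recorded speech, and Y the measured control parameters (jaw,
    # lips, tongue, ...) for the same frames.
    X = rng.normal(size=(5000, 13))   # 13 cepstral coefficients per frame
    Y = rng.normal(size=(5000, 18))   # 18 facial control parameters

    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
    net.fit(X, Y)

    # Synthesis: each incoming frame of cepstra yields one frame of control
    # parameters that drives the animated head.
    params = net.predict(rng.normal(size=(1, 13)))
    print(params.shape)  # (1, 18)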
A set of freely available, universal speech tools is needed to accelerate progress in speech technology. The CSLU Toolkit represents an effort to make the core technology and fundamental infrastructure accessible, affordable, and easy to use. The CSLU Toolkit has been under development for five years. This paper describes recent improvements, additions …
Given the importance of visible information in face-to-face communication, visible speech synthesis is being developed to control and manipulate visible speech. Experiments have shown that this visible speech is particularly important when the auditory speech is degraded because of noise, bandwidth filtering, or hearing impairment (Massaro, 1987). The …
An important question in speech perception is whether listeners have continuous or categorical information about the acoustic signal in speech. Most traditional experimental studies have been interpreted as evidence for categorical perception. It is also argued in the present paper that more recent results taken as evidence against categorical perception …
We report on our recent facial animation work to improve the realism and accuracy of visual speech synthesis. The general approach is to use both static and dynamic observations of natural speech to guide the facial modeling. One current goal is to model the internal articulators, including a highly realistic palate, teeth, and an improved tongue. Because our …
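One common way such static observations enter a polygon face model is as target shapes that are blended per frame. The sketch below shows that generic blending step; the target names, mesh size, and values are illustrative assumptions, not the paper's actual model.

    # Generic morph-target blending: each animation frame is the neutral
    # face plus a weighted sum of target displacements. Names and sizes
    # here are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(1)
    n_vertices = 1000
    neutral = rng.normal(size=(n_vertices, 3))   # resting face mesh
    targets = {                                  # per-target vertex displacements
        "jaw_open": rng.normal(0.0, 0.1, (n_vertices, 3)),
        "lip_round": rng.normal(0.0, 0.1, (n_vertices, 3)),
        "tongue_raise": rng.normal(0.0, 0.1, (n_vertices, 3)),
    }

    def blend(weights):
        """Vertex positions for one frame as a weighted blend of targets."""
        frame = neutral.copy()
        for name, w in weights.items():
            frame += w * targets[name]
        return frame

    frame = blend({"jaw_open": 0.6, "lip_round": 0.2})  # an /o/-like posture
    print(frame.shape)  # (1000, 3)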
Animated agents are becoming increasingly frequent in research and applications in speech science. An important challenge is to evaluate the effectiveness of the agent in terms of the intelligibility of its visible speech. Sumby and Pollack (1954) proposed a metric to describe the benefit provided by the face relative to the auditory speech presented alone. …
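Sumby and Pollack's measure is usually expressed as the gain from adding the face relative to the room left for improvement, R = (AV - A) / (1 - A), where A is the proportion correct with auditory speech alone and AV the proportion correct with the face added. A small sketch:

    # Sumby-Pollack visual-benefit metric: the fraction of the possible
    # improvement over auditory-alone performance that the face provides.
    def visual_benefit(a_alone, audiovisual):
        """R = (AV - A) / (1 - A); assumes a_alone < 1."""
        return (audiovisual - a_alone) / (1.0 - a_alone)

    # Example: 40% correct auditory-alone, 70% correct with the face.
    print(visual_benefit(0.40, 0.70))  # 0.5: half the shortfall is recovered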
Subjects naturally integrate auditory and visual information in bimodal speech perception. To assess the robustness of the integration process, the relative onset time of the audible and visible sources was systematically varied. In the first experiment, bimodal syllables composed of the auditory and visible syllables /ba/ and /da/ were …
Animated agents are becoming increasingly frequent in research and applications in speech science. An important challenge is to evaluate the effectiveness of the agent in terms of the intelligibility of its visible speech. In three experiments, we extend and test the Sumby and Pollack (1954) metric to allow the comparison of an agent relative to a standard …
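The extension allows the benefit of a synthetic face to be compared against that of a standard such as a natural face. One natural way to express such a comparison, offered here as an assumption rather than the paper's exact formulation, is the ratio of the two relative gains:

    # Hypothetical comparison of an animated agent against a natural face,
    # each scored with the Sumby-Pollack relative gain defined above.
    def relative_effectiveness(r_agent, r_natural):
        """Agent's visual benefit as a fraction of the natural face's."""
        return r_agent / r_natural

    # Example: agent recovers 40% of the shortfall, the natural face 50%.
    print(relative_effectiveness(0.40, 0.50))  # 0.8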