Darren Cosker

In this paper we investigate the problem of integrating the complementary audio and visual modalities for speech separation. Rather than using the independence criteria suggested in most blind source separation (BSS) systems, we use visual features from the video signal as additional information to optimize the unmixing matrix. We achieve this by using a …
This paper presents the first dynamic 3D FACS data set for facial expression research, containing 10 subjects performing between 19 and 97 different AUs both individually and in combination. In total the corpus contains 519 AU sequences. The peak expression frame of each sequence has been manually FACS coded by certified FACS experts. This provides a ground …
In this paper we present the first Facial Action Coding System (FACS) valid model to be based on dynamic 3D scans of human faces for use in graphics and psychological research. The model consists of FACS Action Unit (AU) based parameters and has been independently validated by FACS experts. Using this model, we explore the perceptual differences between …
Detecting cooperative partners in situations that have financial stakes is crucial to successful social exchange. The authors tested whether humans are sensitive to subtle facial dynamics of counterparts when deciding whether to trust and cooperate. Participants played a 2-person trust game before which the facial dynamics of the other player were …
We examined the effects of the temporal quality of smile displays on impressions and decisions made in a simulated job interview. We also investigated whether similar judgments were made in response to synthetic (Study 1) and human facial stimuli (Study 2). Participants viewed short video excerpts of female interviewees exhibiting dynamic authentic smiles, …
In this paper a technique is presented for learning audio-visual correlations in non-speech-related articulations such as laughs, cries, sneezes and yawns, such that accurate new visual motions may be created given just audio. Our underlying model is data-driven and provides reliable performance given voices the system is familiar with as well as new …
We introduce a video-based approach for producing water surface models. Recent advances in this field output high-quality results but require dedicated capturing devices and work only in limited conditions. In contrast, our method achieves a good tradeoff between visual quality and production cost: it automatically produces a visually plausible …
We present a system capable of producing video-realistic videos of a speaker given audio only. The audio input signal requires no phonetic labelling and is speaker independent. The system requires only a small training set of video to achieve convincing realistic facial synthesis. The system learns the natural mouth and face dynamics of a speaker to allow …