Jacek C. Wojdel

This paper describes the gathering and availability of an audiovisual speech corpus for the Dutch language. The corpus was prepared with multi-modal speech recognition in mind and is currently used in our research on lipreading and bimodal speech recognition. It contains the prompts also used in the well-established POLYPHONE corpus and therefore…
In this paper we present lipreading experiments with different sets of features extracted from the video sequence. In our experiments we use simple color-based filtering techniques to extract feature vectors from the incoming video signal. Some of these features are directly related to the geometrical properties of the lips (their position and…
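The abstract only names the technique, so as a rough illustration, a minimal sketch of color-based lip filtering could look like the following. The red/green ratio test, the threshold value, and the OpenCV contour step are assumptions for the sake of the example, not the authors' actual feature extractor:

```python
import cv2
import numpy as np

def lip_features(frame_bgr, ratio_threshold=1.6):
    """Rough color-based lip localisation (assumed thresholds, not the paper's).

    Lips tend to have a higher red-to-green ratio than the surrounding skin,
    so a per-pixel ratio test yields a candidate lip mask.
    """
    frame = frame_bgr.astype(np.float32) + 1.0      # avoid division by zero
    red_green_ratio = frame[:, :, 2] / frame[:, :, 1]
    mask = (red_green_ratio > ratio_threshold).astype(np.uint8)

    # Keep the largest connected blob as the lip region (OpenCV 4 signature).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    lips = max(contours, key=cv2.contourArea)

    # Simple geometric features: centre position, width, height, area.
    x, y, w, h = cv2.boundingRect(lips)
    return np.array([x + w / 2.0, y + h / 2.0, w, h, cv2.contourArea(lips)])
```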
Current audio-only speech recognition still lacks the expected robustness when the Signal-to-Noise Ratio (SNR) decreases. Video information is not affected by acoustic noise, which makes it an ideal candidate for data fusion to the benefit of speech recognition. In [1] the authors have shown that most of the techniques used for the extraction of static visual…
This paper presents our experiments on continuous audiovisual speech recognition. A number of bimodal systems using feature fusion or fusion within Hidden Markov Models are implemented. Experiments with different fusion techniques and their results are presented. Furthermore, the performance levels of the bimodal system and a unimodal speech recognizer under…
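As a hedged illustration of the feature-fusion variant (not a reconstruction of the paper's actual system), the simplest approach is to bring the audio and video feature streams to a common frame rate and concatenate them before training a single recognizer. The frame rates and feature dimensions mentioned below are assumptions:

```python
import numpy as np

def fuse_features(audio_feats, video_feats):
    """Feature-level (early) fusion: upsample the slower video stream to the
    audio frame rate and concatenate the per-frame vectors.

    audio_feats: (T_a, D_a), e.g. MFCCs at 100 frames/s (assumed rate)
    video_feats: (T_v, D_v), e.g. geometric lip features at 25 frames/s (assumed rate)
    """
    t_a = audio_feats.shape[0]
    # Nearest-frame alignment of the video stream to the audio time axis.
    idx = np.linspace(0, video_feats.shape[0] - 1, t_a).round().astype(int)
    video_upsampled = video_feats[idx]
    return np.hstack([audio_feats, video_upsampled])

# The fused (T_a, D_a + D_v) observation sequence can then be modelled by a
# single HMM-based recognizer, just as in the audio-only case.
```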
In this paper we present how to implement the co-occurrence rules defined by psychologist Paul Ekman in a computer-animated face. The rules describe the dependencies between the atomic observable movements of the human face (so-called Action Units). They are defined in a form suitable for a human observer who needs to produce a consistent binary scoring of…
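As a minimal sketch of what a binary co-occurrence (dominance) rule looks like in code, one might encode each rule as a pair of Action Units and force the suppressed one off whenever the dominant one is scored active. The specific AU pairs below are hypothetical placeholders, not Ekman's actual rules, and this ignores the graded intensities needed for animation:

```python
# Hypothetical dominance rules: (dominant AU, suppressed AU). The actual rule
# set comes from Ekman's FACS manual and is not reproduced here.
DOMINANCE_RULES = [(9, 10), (26, 25)]

def apply_cooccurrence(scores):
    """Enforce binary dominance rules on an Action Unit scoring.

    scores: dict mapping AU number -> bool (active / not active).
    Returns a copy in which any AU suppressed by an active dominant AU
    is forced to False.
    """
    result = dict(scores)
    for dominant, suppressed in DOMINANCE_RULES:
        if result.get(dominant, False):
            result[suppressed] = False
    return result

print(apply_cooccurrence({9: True, 10: True, 25: True}))
# -> {9: True, 10: False, 25: True}
```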