Abstract-This paper gives a detailed account of a system environment for the treatment of general problems of image and speech understanding. It provides a framework for the representation of declarative and procedural knowledge based on a suitable definition of a semantic network. The syntax and semantics of the network are clearly defined. In addition, the ...
The idea of using articulatory representations for automatic speech recognition (ASR) continues to attract much attention in the speech community. Representations grouped under the label "articulatory" include articulatory parameters derived by means of acoustic-articulatory transformations (inverse filtering), direct physical measurements or ...
To enable the widespread use of robots in home and office environments, systems with natural interaction capabilities have to be developed. A prerequisite for natural interaction is the robot's ability to automatically recognize when and for how long a person's attention is directed towards it for communication. As in open environments several persons ...
  • A Haasch, S Hohenner, S Hüwel, M Kleinehagenbrock, S Lang, I Toptsis +4 others
  • 2004
In the recent past, service robots that are able to interact with humans in a natural way have become increasingly popular. A special kind of service robot designed for personal use at home is the so-called robot companion. Robot companions are expected to communicate with non-expert users in a natural and intuitive way. For such natural interactions with ...
BACKGROUND: When our PC goes on strike again we tend to curse it as if it were a human being. Why and under which circumstances do we attribute human-like properties to machines? Although humans increasingly interact directly with machines, it remains unclear whether humans implicitly attribute intentions to them and, if so, whether such interactions resemble ...
The ability to robustly track a person is an important prerequisite for human-robot interaction. This paper presents a hybrid approach for integrating vision and laser range data to track a human. The legs of a person can be extracted from laser range data, while skin-colored faces are detectable in camera images showing the upper body of a person. As ...
The combination of multiple speech recognizers based on different signal representations is increasingly attracting interest in the speech community. In previous work we presented a hybrid speech recognition system based on the combination of acoustic and articulatory information which achieved significant word error rate reductions under highly noisy ...
Although mosaics are well established as a compact and non-redundant representation of image sequences, their application still suffers from restrictions on camera motion or has to deal with parallax errors. We present an approach that allows construction of mosaics from arbitrary motion of a head-mounted camera pair. As there are no parallax errors ...
In this paper we propose to recognize manipulative hand gestures by incorporating symbolic constraints in a particle filtering approach used for trajectory-based activity recognition. To this end, the notion of situational and spatial context of a gesture is introduced. This scene context is incorporated during the analysis of the trajectory data. A first ...