Samira Sheikhi

We introduce a new multimodal interaction dataset with extensive annotations in a conversational Human-Robot Interaction (HRI) scenario. It has been recorded and annotated to benchmark many relevant perceptual tasks towards enabling a robot to converse with multiple humans, such as speaker localization, keyword spotting, and speech recognition in the audio domain; …
The ability to recognize people's Visual Focus of Attention (VFOA, i.e., what or whom a person is looking at) is important for robots or conversational agents interacting with multiple people, since it plays a key role in turn-taking, engagement, or intention monitoring. As eye gaze estimation is often impossible to achieve, most systems currently rely …
We address the recognition of people's visual focus of attention (VFOA), the discrete version of gaze that indicates who is looking at whom or what. As a good indicator of addresseehood (who speaks to whom, and in particular whether a person is speaking to the robot) and of people's interest, VFOA is an important cue for supporting dialog modelling in Human-Robot …
In the last couple of years, more and more multimodal corpora have been created, and recently many of these corpora have also included RGB-D sensor data. However, there is, to our knowledge, no publicly available corpus that combines accurate gaze tracking and high-quality audio recording for group discussions of varying dynamics. With a corpus that would …
We introduce a new conversational Human-Robot Interaction (HRI) dataset with a real-behaving robot inducing interactive behavior with and between humans. Our scenario involves the humanoid robot NAO explaining paintings in a room and then quizzing the participants, who are naive users. As perceiving nonverbal cues, apart from the spoken words, …
This paper addresses the recognition of people's visual focus of attention (VFOA), the discrete version of gaze indicating who is looking at whom or what. In the absence of high-definition images, we rely on people's head pose to recognize the VFOA. Contrary to most previous works, which assumed a fixed mapping between head pose directions and gaze target …
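To make the fixed-mapping baseline that this abstract argues against concrete, here is a minimal sketch: each gaze target is assumed to have a fixed reference head direction, and a head pose is assigned to the target with the highest score under an isotropic Gaussian around that direction. The target names, reference angles, and spread SIGMA are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical fixed mapping from head pose (pan, tilt) to discrete VFOA
# targets: each target has a fixed reference head direction, and poses are
# scored under an isotropic Gaussian centered on that direction.
TARGETS = {
    "robot":    np.array([0.0, -10.0]),   # (pan, tilt) in degrees, assumed
    "person_1": np.array([35.0, 0.0]),
    "painting": np.array([-40.0, 5.0]),
}
SIGMA = 15.0  # assumed angular standard deviation per axis

def vfoa_from_head_pose(pan: float, tilt: float) -> str:
    """Return the most likely VFOA target for a given head pose."""
    pose = np.array([pan, tilt])
    scores = {
        name: -np.sum((pose - ref) ** 2) / (2 * SIGMA ** 2)
        for name, ref in TARGETS.items()
    }
    return max(scores, key=scores.get)

print(vfoa_from_head_pose(2.0, -8.0))  # -> "robot"
```

The paper's contribution is precisely to relax this fixed mapping, e.g., by adapting the pose-to-target relation to the situation, so the sketch should be read as the baseline being improved upon.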
In this paper we evaluate and compare six well-known foreground-background subtraction methods against a standard database. To compare these algorithms objectively, we chose three challenging scenarios from this database. The algorithms were applied to image sequences of 100 to 800 frames. We examined the results thoroughly …
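As a rough illustration of this kind of side-by-side evaluation, the sketch below runs two of OpenCV's built-in background subtractors over a video and tallies foreground pixels per method. The input filename is hypothetical, and the two subtractors merely stand in for the six algorithms the paper actually compares.

```python
import cv2

# Two stand-in background-subtraction methods (the paper's six are not
# named in this snippet of the abstract).
subtractors = {
    "MOG2": cv2.createBackgroundSubtractorMOG2(),
    "KNN":  cv2.createBackgroundSubtractorKNN(),
}

cap = cv2.VideoCapture("sequence.avi")  # hypothetical input sequence
foreground_pixels = {name: 0 for name in subtractors}

while True:
    ok, frame = cap.read()
    if not ok:
        break
    for name, sub in subtractors.items():
        mask = sub.apply(frame)  # per-pixel foreground mask for this frame
        foreground_pixels[name] += int(cv2.countNonZero(mask))

cap.release()
print(foreground_pixels)  # crude foreground-volume comparison, not accuracy
```

A real evaluation would score each mask against ground-truth annotations (e.g., precision/recall per frame) rather than just counting foreground pixels.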
Prediction using pure standards is expected to be biased whenever the slope of the calibration is affected by the presence of the sample matrix. Moreover, in the presence of unknown spectral interferents, first-order algorithms like partial least squares cannot be used. In this study, a method for the determination of carvedilol (CAR) in tablet and urine samples is …
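A small worked example of the matrix-effect reasoning: under standard addition (one common remedy when the matrix changes the calibration slope; the paper's own method may differ), the sample itself is spiked with known amounts of analyte and the fitted line is extrapolated to zero signal. The numbers below are invented for illustration.

```python
import numpy as np

# Standard-addition sketch: because the line is fitted on the spiked sample,
# its slope already reflects the matrix effect, unlike a pure-standard curve.
added  = np.array([0.0, 1.0, 2.0, 3.0])       # spiked concentration (ug/mL)
signal = np.array([0.42, 0.81, 1.19, 1.58])   # measured response (made up)

slope, intercept = np.polyfit(added, signal, 1)
concentration = intercept / slope  # magnitude of the x-intercept
print(f"estimated analyte concentration: {concentration:.2f} ug/mL")
```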
The Visual Focus of Attention (what or whom a person is looking at), or VFOA, is a fundamental cue in non-verbal communication and plays an important role when designing effective human-machine interaction systems. However, recognizing the VFOA of an interacting person is difficult for a robot, since, due to low-resolution imaging, eye gaze estimation is not …
The paper investigates the problem of addressee recognition (to whom a speaker's utterance is intended) in a setting involving a humanoid robot interacting with multiple persons. More specifically, as it is well known that the addressee can primarily be derived from the speaker's visual focus of attention (VFOA), defined as whom or what a person is looking at, …
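A toy sketch of the VFOA-based heuristic this abstract builds on: take the addressee of an utterance to be the target the speaker looks at most while speaking. The function name, labels, and data are hypothetical.

```python
from collections import Counter

def addressee_from_vfoa(vfoa_track: list[str]) -> str:
    """vfoa_track: per-frame VFOA labels of the speaker during one utterance.
    Returns the most frequently gazed-at target as the inferred addressee."""
    counts = Counter(label for label in vfoa_track if label != "unfocused")
    return counts.most_common(1)[0][0] if counts else "unknown"

print(addressee_from_vfoa(["robot", "robot", "person_2", "robot"]))  # -> "robot"
```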