Sanshzar Kettebekov

This paper presents a framework for designing a natural multimodal human computer interaction (HCI) system. The core of the proposed framework is a principled method for combining information derived from audio and visual cues. To achieve natural interaction, both audio and visual modalities are fused along with feedback through a large screen display.
Although recognition of natural speech and gestures has been studied extensively, previous attempts to combine them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that ...
This paper presents a multimodal crisis management system (XISM). It employs processing of natural gesture and speech commands elicited by a user to efficiently manage complex dynamic emergency scenarios on a large display. The developed prototype system demonstrates the means of incorporating unconstrained free-hand gestures and speech in a real-time ...
Although speech and gesture recognition has been studied extensively, all the successful attempts to combine them in a unified framework were semantically motivated, e.g., keyword co-occurrence. Such formulations inherited the complexity of natural language processing. This paper presents a statistical approach that uses physiological phenomenon of ...