Sanshzar Kettebekov

This paper presents a framework for designing a natural multimodal human-computer interaction (HCI) system. The core of the proposed framework is a principled method for combining information derived from audio and visual cues. To achieve natural interaction, both audio and visual modalities are fused along with feedback through a large screen display. …
This paper presents a multimodal crisis management system (XISM). It employs processing of natural gesture and speech commands elicited by a user to efficiently manage complex dynamic emergency scenarios on a large display. The developed prototype system demonstrates the means of incorporating unconstrained free-hand …
Although speech and gesture recognition have been studied extensively, all successful attempts at combining them in a unified framework were semantically motivated, e.g., keyword co-occurrence. Such formulations inherited the complexity of natural language processing. This paper presents a statistical approach that uses the physiological phenomenon of …
Although recognition of natural speech and gestures has been studied extensively, previous attempts at combining them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that …
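A Bayesian fusion of this kind can be sketched as follows: the visual channel supplies a likelihood over gesture classes, and a co-occurring spoken keyword supplies a prior, so the posterior is P(gesture | video, speech) ∝ P(video | gesture) · P(gesture | keyword). The gesture classes, keyword table, and all numbers below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

GESTURES = ["point", "contour", "circle"]

# Assumed co-occurrence prior P(gesture | spoken keyword); invented values.
PRIOR_GIVEN_KEYWORD = {
    "here":  np.array([0.8, 0.1, 0.1]),
    "along": np.array([0.1, 0.8, 0.1]),
    "area":  np.array([0.2, 0.2, 0.6]),
}

def classify(visual_likelihood, keyword):
    """Posterior over gestures: combine visual likelihood with speech prior."""
    # Fall back to a uniform prior when no informative keyword co-occurs.
    prior = PRIOR_GIVEN_KEYWORD.get(keyword, np.full(len(GESTURES), 1.0 / len(GESTURES)))
    posterior = visual_likelihood * prior
    posterior = posterior / posterior.sum()
    return GESTURES[int(np.argmax(posterior))], posterior

# Example: the visual model is ambiguous between two classes;
# the co-occurring keyword "along" tips the decision to "contour".
likelihood = np.array([0.45, 0.45, 0.10])
label, post = classify(likelihood, "along")
print(label, post)
```

The point of the example is that speech acts only as a prior, so no full natural-language parse is required, which matches the abstracts' contrast with semantically motivated formulations.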
Despite recent advances in gesture recognition, reliance on the visual signal alone to classify unrestricted continuous gesticulation is inherently error-prone. Since spontaneous gesticulation is mostly coverbal in nature, some attempts have been made at using speech cues to improve gesture recognition. …