Stavros Theodorakis

We explore novel directions for incorporating phonetic transcriptions into sub-unit-based statistical models for sign language recognition. First, we employ a new symbolic processing approach for converting sign language annotations, based on HamNoSys symbols, into structured sequences of labels according to the Posture-Detention-Transition-Steady Shift …
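To make the dynamic-static distinction behind Posture-Detention-Transition-Steady Shift labels concrete, here is a minimal sketch that segments a wrist trajectory into static (posture/detention-like) and dynamic (transition/steady-shift-like) spans by a speed threshold. The input format, threshold, and function name are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def pdts_segments(trajectory, static_thresh=2.0):
    """Toy dynamic/static segmenter in the spirit of the
    Posture-Detention-Transition-Steady Shift scheme (assumed I/O).

    trajectory   : (T, 2) array of wrist positions, one row per frame
    static_thresh: speed (pixels/frame) below which a frame counts as static
    """
    speed = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
    labels = np.where(speed < static_thresh, "static", "dynamic")
    # Collapse consecutive identical frame labels into segments.
    segments, start = [], 0
    for t in range(1, len(labels)):
        if labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    segments.append((labels[start], start, len(labels)))
    return segments

# Example: a hold, a rightward movement, then another hold.
traj = np.concatenate([np.tile([0, 0], (10, 1)),
                       np.cumsum(np.tile([5, 0], (10, 1)), axis=0),
                       np.tile([50, 0], (10, 1))])
print(pdts_segments(traj))  # [('static', 0, 9), ('dynamic', 9, 19), ('static', 19, 29)]
```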
We address multistream sign language recognition and focus on efficient multistream integration schemes. Alternative approaches are investigated and the application of Product-HMMs (PHMM) is proposed. The PHMM is a variant of the general multistream HMM that also allows for partial asynchrony between the streams. Experiments in classification and isolated …
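As a rough illustration of multistream scoring (not the full PHMM decoder, which additionally handles partial asynchrony between streams), the sketch below trains one HMM per stream per sign and classifies by a weighted sum of per-stream log-likelihoods; the use of hmmlearn, the data layout, and the stream weights are assumptions of this sketch.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # assumed dependency

def train_stream_models(data, n_states=3):
    """data: dict mapping sign -> list of streams, where each stream is a
    list of (T_i, d) observation arrays (one per training example)."""
    models = {}
    for sign, streams in data.items():
        models[sign] = []
        for seqs in streams:
            X = np.concatenate(seqs)
            lengths = [len(s) for s in seqs]
            models[sign].append(GaussianHMM(n_components=n_states).fit(X, lengths))
    return models

def classify(models, obs_streams, weights):
    """Score each sign as a weighted sum of per-stream log-likelihoods;
    for isolated signs this amounts to resynchronising the streams only
    at sign boundaries."""
    def score(sign):
        return sum(w * m.score(o)
                   for w, m, o in zip(weights, models[sign], obs_streams))
    return max(models, key=score)
```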
We propose and investigate a framework that combines probabilistic and morphological visual processing for the segmentation, tracking, and handshape modeling of the hands, and which serves as a front-end for sign language video analysis. Our ultimate goal is to explore automatic Handshape Sub-Unit (HSU) construction and moreover the …
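The abstract does not spell out the morphological operators; a typical cleanup of a probabilistic skin mask, sketched here with OpenCV under assumed parameters, uses opening to remove speckle and closing to fill holes before keeping the hand/head blobs.

```python
import cv2
import numpy as np

def clean_skin_mask(prob_map, thresh=0.5):
    """Binarise a skin-probability map and clean it morphologically
    (illustrative threshold and kernel size)."""
    mask = (prob_map > thresh).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    # Keep the three largest connected components (two hands + head).
    n, cc = cv2.connectedComponents(mask)
    sizes = [(cc == i).sum() for i in range(1, n)]
    keep = np.argsort(sizes)[-3:] + 1
    return np.isin(cc, keep).astype(np.uint8) * 255
```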
We investigate automatic phonetic sub-unit modeling for sign language that is completely data-driven and requires no prior phonetic information. A first step of visual processing leads to simple and effective region-based visual features. Prior to the sub-unit modeling, we propose to employ a pronunciation clustering step with respect to each sign. …
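One plausible reading of the pronunciation-clustering step (the details are cut off above) is to group the realisations of each sign before sub-unit training, so that distinct pronunciations receive separate models. A toy version under that assumption, resampling variable-length trajectories and clustering them with k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

def resample(traj, n=20):
    """Linearly resample a (T, d) trajectory to n frames so that
    variable-length realisations become comparable fixed-size vectors."""
    t = np.linspace(0, len(traj) - 1, n)
    cols = [np.interp(t, np.arange(len(traj)), traj[:, j])
            for j in range(traj.shape[1])]
    return np.stack(cols, axis=1).ravel()

def pronunciation_clusters(realisations, n_clusters=2):
    """Assign each realisation of one sign to a pronunciation cluster;
    each cluster would then get its own sub-unit model."""
    X = np.stack([resample(r) for r in realisations])
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
```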
The visual processing of Sign Language (SL) videos offers multiple interdisciplinary challenges for image processing and recognition. Based on tracking and visual feature extraction, we investigate SL visual phonetic modeling by exploiting statistical subunit (SU) models of movement-position and handshape. We further propose a new framework to construct a …
We propose a novel affine-invariant modeling of hand shape-appearance images, which offers a compact and descriptive representation of the hand configurations. Our approach combines: 1) a hybrid representation of both the shape and the appearance of the hand that models the handshapes without any landmark points; 2) modeling of the shape-appearance images with a …
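The abstract is truncated, but landmark-free shape-appearance models of this kind are commonly realised as a linear eigen-image expansion of aligned hand images. A minimal PCA sketch under that assumption (the affine alignment itself is taken as done upstream):

```python
import numpy as np
from sklearn.decomposition import PCA

class ShapeAppearanceModel:
    """Eigen-image model of affinely aligned hand images (a sketch;
    class name and interface are illustrative)."""
    def __init__(self, n_components=15):
        self.pca = PCA(n_components=n_components)

    def fit(self, images):            # images: (N, H, W) aligned hand crops
        self.shape = images.shape[1:]
        self.pca.fit(images.reshape(len(images), -1))
        return self

    def encode(self, image):          # compact handshape feature vector
        return self.pca.transform(image.reshape(1, -1))[0]

    def reconstruct(self, coeffs):    # back to image space
        return self.pca.inverse_transform(coeffs).reshape(self.shape)
```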
We investigate the automatic phonetic modeling of sign language based on phonetic sub-units, which are data-driven and require no prior phonetic information. Visual processing is based on a probabilistic skin color model and a framewise geodesic active contour segmentation; occlusions are handled by a forward-backward prediction component, leading finally …
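A common instantiation of a probabilistic skin colour model (the paper's exact parametrisation is not visible here) is a single Gaussian in CbCr space; its per-pixel likelihood map can then drive the contour segmentation. A minimal sketch with assumed uint8 BGR inputs:

```python
import numpy as np
import cv2

def fit_skin_model(skin_samples_bgr):
    """Fit a Gaussian to (N, 3) uint8 BGR skin samples in CbCr space."""
    cbcr = cv2.cvtColor(skin_samples_bgr.reshape(-1, 1, 3),
                        cv2.COLOR_BGR2YCrCb).reshape(-1, 3)[:, 1:]
    mean = cbcr.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(cbcr.T.astype(float)))
    return mean, inv_cov

def skin_probability(frame_bgr, mean, inv_cov):
    """Per-pixel (unnormalised) skin likelihood; thresholded, or used as an
    external force for a geodesic active contour."""
    cbcr = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)[..., 1:].astype(float)
    d = cbcr - mean
    mahal = np.einsum('...i,ij,...j->...', d, inv_cov, d)
    return np.exp(-0.5 * mahal)
```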
We propose the novel dynamic affine-invariant shape-appearance model (Aff-SAM) and employ it for handshape classification and sign recognition in sign language (SL) videos. Aff-SAM offers a compact and descriptive representation of hand configurations as well as regularized model-fitting, assisting hand tracking and extracting handshape …
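Regularised model fitting is, in spirit, a penalised reconstruction problem; one toy reading on top of the PCA sketch above uses a ridge penalty so the fit stays close to the handshape subspace under noise or partial occlusion. The penalty form and weight are my assumptions, not Aff-SAM's actual energy.

```python
import numpy as np

def fit_coeffs(model, image, reg=0.1):
    """Ridge-regularised projection onto the eigen-image subspace:
    argmin_c ||x - cB||^2 + reg * ||c||^2, which for orthonormal PCA
    rows B has the closed form c = x B^T / (1 + reg)."""
    x = image.reshape(1, -1) - model.pca.mean_
    B = model.pca.components_            # (k, D), orthonormal rows
    return (x @ B.T)[0] / (1.0 + reg)
```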
We present a new framework for multimodal gesture recognition that is based on a two-pass fusion scheme. In this, we deal with a demanding Kinect-based multimodal dataset, which was introduced in a recent gesture recognition challenge. We employ multiple modalities, i.e., visual cues such as colour and depth images, as well as audio, and we specifically …
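The details of the two-pass scheme are cut off above; as a generic illustration of the second pass, the sketch below re-ranks first-pass gesture hypotheses by a weighted sum of per-modality log-scores. The data layout, labels, and weights are all hypothetical.

```python
def rescore_nbest(nbest, modality_scores, weights):
    """nbest          : candidate gesture labels from a first pass
       modality_scores: dict  modality -> {label: log-score}
       weights        : dict  modality -> stream weight
    Second pass: re-rank candidates by the fused score."""
    def fused(label):
        return sum(w * modality_scores[m][label] for m, w in weights.items())
    return sorted(nbest, key=fused, reverse=True)

# Hypothetical colour/depth/audio log-scores for three candidates.
scores = {"rgb":   {"wave": -10.0, "point": -12.0, "swipe": -11.0},
          "depth": {"wave": -9.5,  "point": -8.0,  "swipe": -12.5},
          "audio": {"wave": -7.0,  "point": -13.0, "swipe": -9.0}}
print(rescore_nbest(["wave", "point", "swipe"], scores,
                    {"rgb": 1.0, "depth": 1.0, "audio": 0.5}))  # wave first
```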
We explore the integration of movement-position (MP) and handshape (HS) cues for sign language recognition. The proposed method combines data-driven subunit (SU) modeling, exploiting the dynamic-static notion for MP, with the affine shape-appearance SUs for HS configurations. These aspects lead to the new dynamic-static integration of manual cues. This …
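One way to read the dynamic-static integration (the abstract breaks off) is that the stream weighting depends on segment type, with MP cues dominating dynamic sub-units and HS cues dominating static ones; a sketch under that assumption, with illustrative weights:

```python
def fuse_dynamic_static(segments, w_dyn=(0.8, 0.2), w_stat=(0.2, 0.8)):
    """segments: list of (kind, mp_loglik, hs_loglik) per sub-unit,
    with kind in {"dynamic", "static"}.  Returns a sign-level score with
    segment-dependent (MP, HS) stream weights."""
    total = 0.0
    for kind, mp, hs in segments:
        w_mp, w_hs = w_dyn if kind == "dynamic" else w_stat
        total += w_mp * mp + w_hs * hs
    return total
```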