Antonios Oikonomopoulos

This paper addresses the problem of human action recognition by introducing a sparse representation of image sequences as a collection of spatiotemporal events that are localized at points that are salient both in space and time. We detect the spatiotemporal salient points by measuring the variations in the information content of pixel neighborhoods not …
This paper addresses the problem of human-action recognition by introducing a sparse representation of image sequences as a collection of spatiotemporal events that are localized at points that are salient both in space and time. The spatiotemporal salient points are detected by measuring the variations in the information content of pixel neighborhoods not …
In this paper we address the problem of localization and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity which relies on the spatiotemporal localization of characteristic ensembles of feature descriptors. Evidence …
In this paper we address the problem of localisation and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity which relies on the spatiotemporal localization of characteristic, sparse, 'visual words' and 'visual …
The extraction and quantization of local image and video descriptors for the subsequent creation of visual codebooks is a technique that has proved extremely effective for image and video retrieval applications. In this paper we build on this concept and extract a new set of visual descriptors that are derived from spatiotemporal salient points detected on …
This paper addresses the problem of human action recognition by introducing a sparse representation of image sequences as a collection of spatiotemporal events that are localized at points that are salient both in space and time. We detect the spatiotemporal salient points by measuring changes in the information content of pixel neighborhoods not only in …
The extraction and quantization of local image and video descriptors for the subsequent creation of visual codebooks is a technique that has proved very effective for image and video retrieval applications. In this paper we build on this concept and propose a new set of visual descriptors that provide a local space-time description of the visual activity. …
This work addresses the problem of human action recognition by introducing a representation of a human action as a collection of short trajectories that are extracted in areas of the scene with a significant amount of visual activity. The trajectories are extracted by an auxiliary particle filtering tracking scheme that is initialized at points that are …
In this paper we propose a tracking scheme specifically tailored for tracking human body parts in cluttered scenes. We model the background and the human skin using Gaussian Mixture Models and we combine these estimates to localize the features to be tracked. We further use these estimates to determine the pixels which belong to the background and those …
In this paper we address the problem of human activity modelling and recognition by means of a hierarchical representation of mined dense spatiotemporal features. At each level of the hierarchy, the proposed method selects feature constellations that are increasingly discriminative and characteristic of a specific action category, by taking into account …