Learn More
Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracked joints may be completely wrong if serious occlusions(More)
We study the problem of action recognition from depth sequences captured by depth cameras, where noise and occlusion are common problems because they are captured with a single commodity camera. In order to deal with these issues, we extract semi-local features called random occupancy pattern (ROP) features, which employ a novel sampling scheme that(More)
Salient object detection is not a pure low-level, bottom-up process. Higher-level knowledge is important even for task-independent image saliency. We propose a unified model to incorporate traditional low-level features with higher-level guidance to detect salient objects. In our model , an image is represented as a low-rank matrix plus sparse noises in a(More)
Actions are spatio-temporal patterns which can be characterized by collections of spatio-temporal invariant features. Detection of actions is to find the re-occurrences (e.g. through pattern matching) of such spatio-temporal patterns. This paper addresses two critical issues in pattern matching-based action detection: (1) efficiency of pattern search in 3D(More)
PROBLEM Fine-grained image similarity, for images with the same category. It is for image-search application, defined by triplets. Query Positive Negative • image similarities are defined subtle difference. • it is more difficult to obtain triplet training data. • we would like to train a model directly from images instead of rely on the hand-crafted(More)
A visual word lexicon can be constructed by clustering primitive visual features, and a visual object can be described by a set of visual words. Such a " bag-of-words " representation has led to many significant results in various vision tasks including object recognition and catego-rization. However, in practice, the clustering of primitive visual features(More)
Enormous uncertainties in unconstrained environments lead to a fundamental dilemma that many tracking algorithms have to face in practice: Tracking has to be computationally efficient, but verifying whether or not the tracker is following the true target tends to be demanding, especially when the background is cluttered and/or when occlusion occurs. Due to(More)