Learn More
In this paper, we present a home-monitoring oriented human activity recognition benchmark database, based on the combination of a color video camera and a depth sensor. Our contributions are two-fold: 1) We have created a publicly releasable human activity video database (i.e., named as RGBD-HuDaAct), which contains synchronized color-depth video streams,(More)
In this paper, we present an <i>automatic</i> web image mining system towards building a <i>universal</i> human age estimator based on facial information, which is applicable to all ethnic groups and various image qualities. First, a large (<391k) yet noisy human aging image dataset is crawled from the photo sharing website <i>Flickr</i> and <i>Google</i>(More)
In this paper, we propose an intelligent photography system, which automatically and professionally generates/recommends user-favorite photo(s) from a wide view or a continuous view sequence. This task is quite challenging given that the evaluation of photo quality is under-determined and usually subjective. Motivated by the recent prevalence of online(More)
The aim of this paper is to address the problem of recognizing human group activities in surveillance videos. This task has great potentials in practice, however was rarely studied due to the lack of benchmark database and the difficulties caused by large intra-class variations. Our contributions are two-fold. Firstly, we propose to encode the(More)
—Automated scene analysis has been a topic of great interest in computer vision and cognitive science. Recently, with the growth of crowd phenomena in the real world, crowded scene analysis has attracted much attention. However, the visual occlusions and ambiguities in crowded scenes, as well as the complex behaviors and scene semantics, make the analysis a(More)
Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between(More)
We investigate the feature design and classification architectures in temporal action localization. This application focuses on detecting and labeling actions in untrimmed videos, which brings more challenge than classifying presegmented videos. The major difficulty for action localization is the uncertainty of action occurrence and utilization of(More)
Along with the explosive growth of multimedia data, automatic multimedia tagging has attracted great interest of various research communities, such as computer vision, multimedia, and information retrieval. However, despite the great progress achieved in the past two decades, automatic tagging technologies still can hardly achieve satisfactory performance(More)