Learn More
Key frame extraction has been recognized as one of the important research issues in video information retrieval. Although progress has been made in key frame extraction, the existing approaches are either compu-tationally expensive or ineeective i n capturing salient visual content. In this paper, we rst discuss the importance of key frame selection; and(More)
In this paper, we propose a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI). To deal with the data points sampled from a nonlinear manifold, for each data point, we construct a local clique comprising this data point and its neighboring data points. Inspired by the Fisher criterion, we(More)
Recently, deep learning approach, especially deep Convolutional Neural Networks (ConvNets), have achieved overwhelming accuracy with fast processing speed for image classification. Incorporating temporal structure with deep ConvNets for video representation becomes a fundamental problem for video content analysis. In this paper, we propose a new approach,(More)
With the development of Motion capture techniques, more and more 3D motion libraries become available. In this paper, we present a novel content-based 3D motion retrieval algorithm. We partition the motion library and construct a motion index tree based on a hierarchical motion description. The motion index tree serves as a classifier to determine the(More)
A better similarity mapping function across heterogeneous high-dimensional features is very desirable for many applications involving multi-modal data. In this paper, we introduce coupled dictionary learning (DL) into supervised sparse coding for multi-modal (cross-media) retrieval. We call this Supervised coupled-dictionary learning with group structures(More)
To reveal and leverage the correlated and complemental information between different views, a great amount of multi-view learning algorithms have been proposed in recent years. However, unsupervised feature selection in multi-view learning is still a challenge due to lack of data labels that could be utilized to select the discriminative features. Moreover,(More)
Automatically recognizing a large number of action categories from videos is of significant importance for video understanding. Most existing works focused on the design of more discriminative feature representation , and have achieved promising results when the positive samples are enough. However, very limited efforts were spent on recognizing a novel(More)