We integrate the cascade-of-rejectors approach with the Histograms of Oriented Gradients (HoG) features to achieve a fast and accurate human detection system. The features used in our system are HoGs of variable-size blocks that capture salient features of humans automatically. Using AdaBoost for feature selection, we identify the appropriate set of blocks, …
Large-scale video copy detection tasks require a compact and computationally efficient descriptor that is robust to the various transformations typically applied to generate copies. In this paper, we propose a new frame-level descriptor for such a task. The descriptor encodes the internal structure of a video frame by computing the pair-wise correlations …
Automatic evaluation of photo aesthetic quality is a challenging problem in multimedia computing. Numerous aesthetic features have been proposed in previous work, but these features are extracted solely from the photo under evaluation. In this paper, we explore the use of multiple images and present relative features that can be easily computed from any …
Conventional image categorization techniques rely primarily on low-level visual cues. In this paper, we describe a multimodal fusion scheme that improves image classification accuracy by incorporating information derived from the embedded text detected in the image under classification. Specific to each image category, a text concept is first …
In this paper, we report our experiments using a real-world image dataset to examine the effectiveness of Isomap, LLE and KPCA. The 1,897-image dataset we used consists of 14 image categories. We have used this dataset in several settings, both supervised and unsupervised, and have found it to be relatively "well behaved" (clusters do exist in a …
We present an approach to measuring similarities between visual data based on approximate string matching. In this approach, an image is represented by an ordered list of feature descriptors. We show the extraction of local feature sequences from two types of 2-D signals: scene images and shape images. The similarity of two such images is then measured by 1) …
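As a rough illustration of approximate string matching over ordered descriptor lists, the sketch below computes a thresholded edit distance between two sequences of feature vectors. It is not the measure used in the paper, whose details are cut off in the abstract above. The function name `approx_match_cost`, the parameters `gap_cost` and `match_threshold`, and the use of Euclidean distance to decide whether two descriptors "match" are all assumptions made for this example.

```python
from math import dist
from typing import Sequence

Descriptor = Sequence[float]


def approx_match_cost(
    a: Sequence[Descriptor],
    b: Sequence[Descriptor],
    gap_cost: float = 1.0,         # hypothetical cost of inserting/deleting a descriptor
    match_threshold: float = 0.5,  # hypothetical distance below which descriptors "match"
) -> float:
    """Edit distance between two ordered descriptor lists (lower = more similar)."""
    n, m = len(a), len(b)
    # dp[i][j] holds the cheapest alignment of a[:i] with b[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap_cost
    for j in range(1, m + 1):
        dp[0][j] = j * gap_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Substitution is free when the two descriptors are close enough.
            close = dist(a[i - 1], b[j - 1]) <= match_threshold
            substitute = dp[i - 1][j - 1] + (0.0 if close else gap_cost)
            delete = dp[i - 1][j] + gap_cost
            insert = dp[i][j - 1] + gap_cost
            dp[i][j] = min(substitute, delete, insert)
    return dp[n][m]


if __name__ == "__main__":
    seq_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    seq_b = [(0.0, 0.1), (2.0, 2.0)]
    print(approx_match_cost(seq_a, seq_b))
```

With the toy sequences in the `__main__` block, the first and last descriptors align and the middle one is deleted, so the cost is a single gap. A real system would likely use a graded substitution cost and normalize by sequence length; both choices are left out here to keep the sketch short.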
Face detection and recognition have numerous multimedia applications of broad interest, one of which is automatic face annotation. Many robust algorithms tackle these problems, but most are computationally demanding and have only been implemented in PC- or server-based environments. In this demonstration, we show a real-time …