Learn More
This paper introduces a web image dataset created by NUS's Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,018 unique tags; (2) six types of low-level features extracted from these images, including 64-D color histogram, 144-D color correlogram, 73-D edge direction histogram, 128-D(More)
We propose a multiple source domain adaptation method, referred to as Domain Adaptation Machine (DAM), to learn a robust decision function (referred to as <i>target classifier</i>) for label prediction of patterns from the target domain by leveraging a set of pre-computed classifiers (referred to as <i>auxiliary/source classifiers</i>) independently learned(More)
To learn the preferential visual attention given by humans to specific image content, we present an eye fixation database compiled from a pool of 758 images and 75 subjects. Eye fixations are an excellent modality to learn semantics-driven human understanding of images, which is vastly different from feature-driven approaches employed by saliency(More)
Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system. This paper leverages the vast amount of multimedia data on the Web,(More)
In this work, we investigate how to automatically reassign the manually annotated labels at the image-level to those contextually derived semantic regions. First, we propose a bi-layer sparse coding formulation for uncovering how an image or semantic region can be robustly reconstructed from the over-segmented image patches of an image set. We then harness(More)
The problem of recognizing actions in realistic videos is challenging yet absorbing owing to its great potentials in many practical applications. Most previous research is limited due to the use of simplified action databases under controlled environments or focus on excessively localized features without sufficiently encapsulating the spatio-temporal(More)
Most current image retrieval systems and commercial search engines use mainly text annotations to index and retrieve WWW images. This research explores the use of machine learning approaches to automatically annotate WWW images based on a predefined list of concepts by fusing evidences from image contents and their associated HTML text. One major practical(More)
We propose a supervised, two-phase framework to address the problem of paraphrase recognition (PR). Unlike most PR systems that focus on sentence similarity, our framework detects dissimilarities between sentences and makes its paraphrase judgment based on the significance of such dissimilarities. The ability to differentiate significant dissimilarities not(More)
Emotions can be evoked in humans by images. Most previous works on image emotion analysis mainly used the elements-of-art-based low-level visual features. However, these features are vulnerable and not invariant to the different arrangements of elements. In this paper, we investigate the concept of principles-of-art and its influence on image emotions.(More)