Automatically generating a natural language description of an image has attracted interest recently, both because of its importance in practical applications and because it connects two major artificial intelligence fields: computer vision and natural language processing. Existing approaches are either top-down, which start from a gist of an image and …
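The top-down framing mentioned above is commonly realized as an encoder-decoder network: a convolutional encoder condenses the image into a "gist" vector, and a recurrent decoder emits the description word by word. Below is a minimal sketch of that general pattern; the module and parameter names (GistCaptioner, vocab_size, and so on) are illustrative assumptions, not the architecture proposed in the paper.

```python
# A minimal top-down captioning sketch (assumed architecture, not the paper's model):
# a CNN encoder produces a single "gist" vector that seeds an LSTM decoder, which
# generates the caption one word at a time.
import torch
import torch.nn as nn

class GistCaptioner(nn.Module):  # hypothetical name
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Encoder: any CNN backbone works; a tiny one keeps the sketch self-contained.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        # The image gist initializes the LSTM hidden state (the "top-down" signal).
        gist = self.encoder(images)                   # (B, hidden_dim)
        h0 = gist.unsqueeze(0)                        # (1, B, hidden_dim)
        c0 = torch.zeros_like(h0)
        words = self.embed(captions)                  # (B, T, embed_dim)
        hidden, _ = self.decoder(words, (h0, c0))     # (B, T, hidden_dim)
        return self.out(hidden)                       # per-step vocabulary logits

# Usage: one teacher-forced forward pass on dummy data.
model = GistCaptioner()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 10000])
```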
Cropping is one of the most common tasks in image editing for improving the aesthetic quality of a photograph. In this paper, we propose a new aesthetic photo cropping system which combines three models: visual composition, boundary simplicity, and content preservation. The visual composition model measures the quality of composition for a …
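The abstract frames crop selection as combining three scores. A hedged sketch of that score-and-rank pattern follows: candidate windows are enumerated and ranked by a weighted sum of three scoring functions. The scorers and weights here are simple placeholders, not the learned models described in the paper.

```python
# Sketch of score-and-rank cropping: enumerate candidate windows and keep the one
# maximizing a weighted combination of composition, boundary-simplicity, and
# content-preservation scores. The three scorers below are trivial stand-ins.
import numpy as np

def composition_score(crop):
    # Placeholder: reward crops whose brightest region sits near a rule-of-thirds point.
    h, w = crop.shape[:2]
    y, x = np.unravel_index(np.argmax(crop.mean(axis=2)), (h, w))
    thirds = [(h / 3, w / 3), (h / 3, 2 * w / 3), (2 * h / 3, w / 3), (2 * h / 3, 2 * w / 3)]
    return -min(np.hypot(y - ty, x - tx) for ty, tx in thirds) / np.hypot(h, w)

def boundary_simplicity_score(crop):
    # Placeholder: penalize strong intensity changes along the crop border (cut objects).
    border = np.concatenate([crop[0], crop[-1], crop[:, 0], crop[:, -1]])
    return -np.abs(np.diff(border.mean(axis=1))).mean()

def content_preservation_score(crop, image):
    # Placeholder: reward crops that keep most of the image's total "energy".
    return crop.astype(float).sum() / image.astype(float).sum()

def best_crop(image, weights=(1.0, 1.0, 1.0), stride=32, size=(128, 128)):
    H, W = image.shape[:2]
    ch, cw = size
    best, best_score = None, -np.inf
    for y in range(0, H - ch + 1, stride):
        for x in range(0, W - cw + 1, stride):
            crop = image[y:y + ch, x:x + cw]
            s = (weights[0] * composition_score(crop)
                 + weights[1] * boundary_simplicity_score(crop)
                 + weights[2] * content_preservation_score(crop, image))
            if s > best_score:
                best, best_score = (y, x, ch, cw), s
    return best

print(best_crop(np.random.randint(0, 256, (256, 384, 3), dtype=np.uint8)))
```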
In this paper, we propose an effective heuristic based on the framework of the shuffled frog-leaping algorithm (SFLA) for solving the resource-constrained project scheduling problem (RCPSP). We encode the virtual frog as the extended activity list (EAL) and decode it by the SFLA-specific serial schedule generation scheme (SSSGS). The initial population is …
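The decoding step named above, a serial schedule generation scheme, turns an activity list into a feasible schedule by starting each activity at the earliest precedence- and resource-feasible time. A generic textbook SSGS sketch follows, for a single renewable resource and illustrative data structures; the SFLA-specific variant in the paper adds details not reproduced here.

```python
# Generic serial schedule generation scheme (SSGS) for the RCPSP: walk the activity
# list in order and start each activity at the earliest time that respects both its
# predecessors' finish times and the renewable-resource capacity.
def ssgs(activity_list, durations, demands, predecessors, capacity):
    horizon = sum(durations.values()) + 1
    usage = [0] * horizon              # resource usage per time period
    start, finish = {}, {}
    for act in activity_list:
        # Earliest start allowed by precedence constraints.
        t = max((finish[p] for p in predecessors[act]), default=0)
        # Delay until the resource has enough free capacity over the whole duration.
        while any(usage[t + k] + demands[act] > capacity for k in range(durations[act])):
            t += 1
        start[act], finish[act] = t, t + durations[act]
        for k in range(durations[act]):
            usage[t + k] += demands[act]
    return start

# Toy instance: 4 activities sharing one resource with capacity 4.
durations    = {1: 3, 2: 2, 3: 2, 4: 1}
demands      = {1: 2, 2: 3, 3: 2, 4: 1}
predecessors = {1: [], 2: [1], 3: [1], 4: [2, 3]}
print(ssgs([1, 2, 3, 4], durations, demands, predecessors, capacity=4))
# {1: 0, 2: 3, 3: 5, 4: 7}
```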
In this paper, we address the problem of object-class retrieval in large image data sets: given a small set of training examples defining a visual category, the objective is to efficiently retrieve images of the same class from a large database. We propose two contrasting retrieval schemes that achieve good accuracy and high efficiency. The first exploits …
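As a baseline illustration of this retrieval setting (not the two schemes the paper proposes), one can fit a lightweight linear classifier on the few query-class examples and rank the database by classifier score. The sketch below assumes precomputed image descriptors and randomly sampled negatives.

```python
# Baseline sketch of object-class retrieval: train a linear SVM on the handful of
# positive examples plus sampled negatives, then rank the whole database by decision
# score. Features are assumed to be precomputed descriptors; data here is synthetic.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
database  = rng.normal(size=(10000, 128))         # precomputed descriptors for the database
positives = rng.normal(loc=0.5, size=(10, 128))   # small set of training examples for the category
negatives = database[rng.choice(len(database), 200, replace=False)]  # assumed negatives

X = np.vstack([positives, negatives])
y = np.concatenate([np.ones(len(positives)), np.zeros(len(negatives))])
clf = LinearSVC(C=1.0).fit(X, y)

scores = clf.decision_function(database)          # one linear scoring pass over the database
top_k = np.argsort(scores)[::-1][:20]             # indices of the 20 highest-scoring images
print(top_k)
```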
Recently, there have been several promising methods to generate realistic imagery from deep convolutional networks. These methods sidestep the traditional computer graphics rendering pipeline and instead generate imagery at the pixel level by learning from large collections of photos (e.g., faces or bedrooms). However, these methods are of limited utility …
Visual-semantic embedding models have recently been proposed and shown to be effective for image classification and zero-shot learning by mapping images into a continuous semantic label space. Although several approaches have been proposed for single-label embedding tasks, handling images with multiple labels (which is a more general setting) still remains …
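A common way to extend single-label embedding to the multi-label setting is a margin-based ranking loss that pulls an image embedding toward each of its positive label embeddings and away from sampled negatives. The sketch below illustrates that generic idea; all names and dimensions are assumptions, not the formulation the abstract goes on to describe.

```python
# Sketch of a multi-label visual-semantic embedding loss: project images into the
# label-embedding space and require every positive label to score higher than a
# sampled negative label by a margin. Illustrative only, not the paper's method.
import torch
import torch.nn.functional as F

def multilabel_ranking_loss(image_emb, label_emb, pos_ids, neg_ids, margin=0.1):
    """image_emb: (B, D) projected images; label_emb: (L, D) label vectors (e.g. word embeddings);
    pos_ids / neg_ids: (B, K) positive and sampled negative label indices per image."""
    img = F.normalize(image_emb, dim=1)
    lab = F.normalize(label_emb, dim=1)
    pos = torch.einsum('bd,bkd->bk', img, lab[pos_ids])  # cosine score for each positive label
    neg = torch.einsum('bd,bkd->bk', img, lab[neg_ids])  # cosine score for each sampled negative
    return F.relu(margin - pos + neg).mean()             # hinge: positives should beat negatives

# Dummy usage with assumed sizes: 4 images, 100 labels, 3 positives/negatives per image.
loss = multilabel_ranking_loss(torch.randn(4, 300), torch.randn(100, 300),
                               torch.randint(0, 100, (4, 3)), torch.randint(0, 100, (4, 3)))
print(loss.item())
```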
Many standard computer vision datasets exhibit biases due to a variety of sources, including illumination conditions, imaging systems, and the preferences of dataset collectors. Biases like these can have downstream effects on the use of vision datasets in the construction of generalizable techniques, especially for the goal of creating a classification …