Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability...
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
  • A. Oliva, A. Torralba
  • Mathematics, Computer Science
  • International Journal of Computer Vision
  • 1 May 2001
The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
Learning Deep Features for Scene Recognition using Places Database
A new scene-centric database called Places, with over 7 million labeled pictures of scenes, is introduced along with new methods to compare the density and diversity of image datasets. Places is shown to be as dense as other scene datasets and more diverse.
Spectral Hashing
Finding the best code for a given dataset is closely related to graph partitioning and can be shown to be NP-hard. A spectral method is obtained whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian.
Learning to predict where humans look
This paper collects eye-tracking data from 15 viewers on 1003 images and uses this database as training and testing examples to learn a model of saliency based on low-, mid-, and high-level image features.
SUN database: Large-scale scene recognition from abbey to zoo
This paper proposes the extensive Scene UNderstanding (SUN) database, which contains 899 categories and 130,519 images, and uses 397 well-sampled categories to evaluate numerous state-of-the-art algorithms for scene recognition and establish new bounds of performance.
Places: A 10 Million Image Database for Scene Recognition
The Places Database is described: a repository of 10 million scene photographs, labeled with scene semantic categories, comprising a large and diverse list of the types of environments encountered in the world. State-of-the-art convolutional neural networks are provided as baselines that significantly outperform previous approaches.
Skip-Thought Vectors
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the...
LabelMe: A Database and Web-Based Tool for Image Annotation
A web-based tool that allows easy image annotation and instant sharing of such annotations is developed, and a large dataset is collected that spans many object categories, often with multiple instances, over a wide variety of images.
SIFT Flow: Dense Correspondence across Scenes and Its Applications
SIFT flow is proposed: a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. Image information is then transferred from the nearest neighbors to a query image according to the dense scene correspondence.