• Publications
SUN database: Large-scale scene recognition from abbey to zoo
TLDR
This paper proposes the extensive Scene UNderstanding (SUN) database that contains 899 categories and 130,519 images and uses 397 well-sampled categories to evaluate numerous state-of-the-art algorithms for scene recognition and establish new bounds of performance.
Learning to predict where humans look
TLDR
This paper collects eye tracking data of 15 viewers on 1003 images and uses this database as training and testing examples to learn a model of saliency based on low, middle and high-level image features.
Recognizing scene viewpoint using panoramic place representation
TLDR
A database of 360° panoramic images organized into 26 place categories is constructed, and the problem of scene viewpoint recognition is introduced: classifying the type of place shown in a photo and recognizing the observer's viewpoint within that category of place.
Modelling search for people in 900 scenes: A combined source model of eye guidance
TLDR
This work puts forth a benchmark for computational models of search in real world scenes by recording observers’ eye movements as they performed a search task (person detection) in 912 outdoor scenes and finding that observers were highly consistent in the regions fixated during search.
SUN Database: Exploring a Large Collection of Scene Categories
TLDR
The Scene Understanding database is proposed, a nearly exhaustive collection of scenes categorized at the same level of specificity as human discourse that contains 908 distinct scene categories and 131,072 images.
TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking
TLDR
This paper introduces a webcam-based gaze tracking system that supports large-scale, crowdsourced eye tracking deployed on Amazon Mechanical Turk (AMTurk), and builds a saliency dataset for a large number of natural images.
Rethinking the Role of Top-Down Attention in Vision: Effects Attributable to a Lossy Representation in Peripheral Vision
TLDR
It is proposed that under normal viewing conditions, the main processes of feature binding and perception proceed largely independently of top-down selective attention, and the texture tiling model (TTM) represents images in terms of a fixed set of “texture” statistics computed over local pooling regions that tile the visual input.
A general account of peripheral encoding also predicts scene perception performance.
TLDR
It is shown that an encoding model previously shown to predict performance in crowded object recognition and visual search also does a reasonably good job of predicting performance on scene perception tasks, suggesting that scene tasks may not be so special.
CB Database: A change blindness database for objects in natural indoor scenes
TLDR
The Change Blindness (CB) Database with object changes in 130 colored images of natural indoor scenes is introduced, intended to provide researchers with a stimulus set of natural scenes with defined stimulus parameters that can be used for a wide range of experiments.
Estimating scene typicality from human ratings and image features
TLDR
The goal of the current study is to determine the prototypical exemplars that best represent each visual scene category, and to evaluate the performance of state-of-the-art global feature algorithms at classifying different types of exemplars.