Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

@article{Lazebnik2006BeyondBO,
  title={Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories},
  author={Svetlana Lazebnik and Cordelia Schmid and Jean Ponce},
  journal={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
  year={2006},
  volume={2},
  pages={2169--2178}
}
  • S. Lazebnik, C. Schmid, J. Ponce
  • Published 17 June 2006
  • Computer Science
  • 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
This paper presents a method for recognizing scene categories based on approximate global geometric correspondence. [...] Specifically, our proposed method exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories. The spatial pyramid framework also offers insights into the success of several recently proposed image descriptions, including Torralba’s "gist" and Lowe’s SIFT descriptors.
Scene classification based on spatial pyramid representation by superpixel lattices and contextual visual features
TLDR
A novel spatial pyramid representation scheme for recognizing scene categories, which includes both local structural information and global spatial structural information, and which can achieve about 87.13% accuracy on a set of 15 categories of complex scenes.
Boosting Classifiers for Scene Category Recognition
TLDR
In order to recognize an unknown image as correctly as possible, this paper first employs multi-class support vector machine (SVM) classifiers to compute posterior probabilities from the individual PHOWs, and then adopts the boosting algorithm to combine the variants of SVM, each trained on a single PHOW, to obtain an improved estimate of the “final” posterior probabilities.
Enhanced Spatial Pyramid Matching Using Log-Polar-Based Image Subdivision and Representation
  • E. Zhang, M. Mayo
  • Computer Science
    2010 International Conference on Digital Image Computing: Techniques and Applications
  • 2010
TLDR
This paper proposes a new method to exploit spatial relationships between image features, based on binned log-polar grids, and shows that this approach improves the results on three diverse datasets over the SPM technique.
Improved spatial pyramid matching for scene recognition
TLDR
A new type of spatial partitioning scheme and a modified pyramid matching kernel based on spatial pyramid matching (SPM) are proposed and a dense histogram of oriented gradients is used as a low-level visual descriptor.
Simple object recognition based on spatial relations and visual features represented using irregular pyramids
TLDR
This work proposes a graph matching scheme that involves color, texture and shape features along with spatial descriptors to represent topological and orientation/directional relationships—which are obtained by means of combinatorial pyramids—in order to identify similar objects from a database.
A Hierarchical Feature Extraction Scheme with Special Vocabulary Generation for Natural Scene Classification
TLDR
A novel approach to recognizing scene categories that extracts appearance features from an image in a pyramid-like fashion; visual words are formed by performing K-means clustering within each category and are concatenated to form a dictionary, in contrast to traditional BOW.
Saliency moments for image categorization
TLDR
Saliency Moments is presented, a new, holistic descriptor for image recognition inspired by two biological vision principles: the gist perception and the selective visual attention, that outperforms the traditional global features on scene and object categorization, for a variety of challenging datasets.
Compact, Adaptive and Discriminative Spatial Pyramid for Improved Scene and Object Classification
TLDR
This thesis explores the problem of obtaining compact, adaptive, yet informative spatial image representations in the context of object and scene classification tasks with a novel SP compression technique that works on two levels; compressing the least informative spatial pyramid features, and a new texture descriptor that represents local image texture and its spatial layout.
A Spatial-Pyramid Scene Categorization Algorithm based on Locality-aware Sparse Coding
TLDR
A novel model to learn and recognize scenes in nature, combining locality constrained sparse coding (LCSP), Spatial Pyramid Pooling, and a linear SVM in an end-to-end model, is investigated.
Natural scene recognition using weighted histograms of gradient orientation descriptor
The automatic recognition of the contents of a scene is an important issue in the computer vision field. Though considerable progress has been made, the complexity of scenes remains an important

References

SHOWING 1-10 OF 38 REFERENCES
Categorizing Nine Visual Classes using Local Appearance Descriptors
TLDR
A thorough evaluation clearly demonstrates that the bag of keypoints method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
  • A. Oliva, A. Torralba
  • Mathematics, Computer Science
    International Journal of Computer Vision
  • 2004
TLDR
The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
Local Features and Kernels for Classification of Texture and Object Categories: An In-Depth Study
Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as
A maximum entropy framework for part-based texture and object recognition
TLDR
A probabilistic part-based approach for texture and object recognition using a discriminative maximum entropy framework to learn the posterior distribution of the class label given the occurrences of parts from the dictionary in the training set.
Modeling scenes with local descriptors and latent aspects
TLDR
Probabilistic latent semantic analysis generates a compact scene representation, discriminative for accurate classification, and significantly more robust when less training data are available, and the ability of PLSA to automatically extract visually meaningful aspects is exploited to propose new algorithms for aspect-based image ranking and context-sensitive image segmentation.
Recognition without Correspondence using Multidimensional Receptive Field Histograms
TLDR
This article presents a technique where appearances of objects are represented by the joint statistics of such local neighborhood operators, which represents a new class of appearance based techniques for computer vision.
Indoor-outdoor image classification
  • M. Szummer, Rosalind W. Picard
  • Computer Science
    Proceedings 1998 IEEE International Workshop on Content-Based Access of Image and Video Database
  • 1998
TLDR
This work systematically studied the features of: histograms in the Ohta color space; multiresolution, simultaneous autoregressive model parameters; and coefficients of a shift-invariant DCT to show how high-level scene properties can be inferred from classification of low-level image features.
Discovering objects and their location in images
TLDR
This work treats object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics, and applies a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA).
Shape matching and object recognition using low distortion correspondences
TLDR
This work approaches recognition in the framework of deformable shape matching, relying on a new algorithm for finding correspondences between feature points, and shows results for localizing frontal and profile faces that are comparable to special purpose approaches tuned to faces.
Pyramid Match Kernels: Discriminative Classification with Sets of Image Features (version 2)
TLDR
A new fast kernel function is presented which maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in this space and is shown to be positive-definite, making it valid for use in learning algorithms whose optimal solutions are guaranteed only for Mercer kernels.