Discriminative spatial pyramid

@inproceedings{Harada2011DiscriminativeSP,
  title={Discriminative spatial pyramid},
  author={Tatsuya Harada and Y. Ushiku and Yuya Yamashita and Yasuo Kuniyoshi},
  booktitle={CVPR 2011},
  year={2011},
  pages={1617--1624}
}
Spatial Pyramid Representation (SPR) is a widely used method for embedding both global and local spatial information into a feature, and it shows good performance in terms of generic image recognition. In SPR, the image is divided into a sequence of increasingly finer grids on each pyramid level. Features are extracted from all of the grid cells and are concatenated to form one huge feature vector. As a result, expensive computational costs are required for both learning and testing. Moreover…
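The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2^l x 2^l grid at level l and bag-of-visual-words histograms as the per-cell feature, neither of which is specified in the abstract, and the names `spatial_pyramid`, `points`, `vocab_size`, and `levels` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x, y, visual_word_index); x and y are assumed normalized to [0, 1].
Point = Tuple[float, float, int]


def spatial_pyramid(points: Sequence[Point], vocab_size: int, levels: int) -> List[float]:
    """Concatenate per-cell visual-word histograms over pyramid levels 0..levels-1.

    Assumption: level l splits the image into 2**l x 2**l equal cells.
    """
    feature: List[float] = []
    for level in range(levels):
        cells = 2 ** level  # cells per side: 1, 2, 4, ...
        hists = [[0.0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in points:
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1.0
        # Append the histograms of this level, cell by cell, in row-major order.
        for cell_hist in hists:
            feature.extend(cell_hist)
    return feature


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.6, 0.2, 0)]
    vec = spatial_pyramid(pts, vocab_size=2, levels=3)
    print(len(vec))  # 2 * (1 + 4 + 16) = 42
```

Under these assumptions the vector length is `vocab_size * (1 + 4 + ... + 4**(levels - 1))`, which shows how quickly the concatenated feature grows with the number of levels and why learning and testing become expensive.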
Adaptive spatial partition learning for image classification
TLDR
This work proposes a data-driven approach to adaptively learn the discriminative spatial partitions corresponding to each class and explores them for image classification; it adopts a discriminative learning formulation with a group-sparse constraint to find a sparse mapping from the feature representation to the label space.
Beyond Spatial Pyramid Matching: Spatial Soft Voting for Image Classification
TLDR
A spatial soft voting method is proposed, in which the existence of the codes is expressed by a Gaussian function and the existence maps are sampled to form a feature vector that is "soft" in both the descriptor space and the image space.
Beyond Spatial Pyramids: A New Feature Extraction Framework with Dense Spatial Sampling for Image Classification
TLDR
This work proposes a new learning algorithm, called Generalized Adaptive lp-norm Multiple Kernel Learning (GA-MKL), to learn an adapted robust classifier based on multiple base kernels constructed from image features and multiple sets of pre-learned classifiers of all the classes.
Improved spatial pyramid matching for scene recognition
TLDR
A new type of spatial partitioning scheme and a modified pyramid matching kernel based on spatial pyramid matching (SPM) are proposed, and a dense histogram of oriented gradients is used as a low-level visual descriptor.
Spatial-Temporal Weighted Pyramid Using Spatial Orthogonal Pooling
TLDR
A novel interpretation is proposed that regards feature pooling as an orthogonal projection in the space of functions that map the image space to the local feature space, together with a novel feature-pooling method that orthogonally projects the function form of local descriptors into the space of low-degree polynomials.
Discriminative Spatial Tree for Image Classification
  • Ye Xu, Xiaodong Yu, Tian Wang, Fuqiang Lu
  • Computer Science
  • 2017 IEEE International Conference on Smart Cloud (SmartCloud)
  • 2017
TLDR
Experimental results on two challenging datasets show that the simplified coding scheme leads to comparable results to some sophisticated ones, and discriminative ST can achieve better classification performance than spatial pyramid.
Image Representation Learning by Deep Appearance and Spatial Coding
TLDR
A deep appearance and spatial coding model is proposed to build a better image representation for the classification task, addressing the discrimination loss in local appearance coding and the lack of spatial information that hinder performance.
An Enhancement to the Spatial Pyramid Matching for Image Classification and Retrieval
TLDR
This paper proposes a new weight function suited to the rotation-invariant SPM structure and investigates three concentric ring partitioning schemes, which succeed in enhancing the effectiveness of SPM for image classification and retrieval.
A Spatial-Pyramid Scene Categorization Algorithm based on Locality-aware Sparse Coding
TLDR
A novel model that learns and recognizes natural scenes by combining locality constrained sparse coding (LCSP), Spatial Pyramid Pooling, and a linear SVM in an end-to-end model is investigated.
Ask the Image: Supervised Pooling to Preserve Feature Locality
TLDR
A standard Spatial Pyramid Representation, commonly adopted to encode spatial information, is combined adaptively through Multiple Kernel Learning with a Feature Space Representation that favors semantic information in an appropriate feature space.

References

SHOWING 1-10 OF 32 REFERENCES
Global Gaussian approach for scene categorization using information geometry
TLDR
This paper defines some similarity measures of the distributions based on an information geometry framework and shows how this conceptually simple approach can provide a satisfactory performance, comparable to the bag-of-keypoints for scene classification tasks.
Learning mid-level features for recognition
TLDR
This work seeks to establish the relative importance of each step of mid-level feature extraction through a comprehensive cross evaluation of several types of coding modules and pooling schemes and shows how to improve the best performing coding scheme by learning a supervised discriminative dictionary for sparse coding.
Linear spatial pyramid matching using sparse coding for image classification
TLDR
An extension of the SPM method is developed, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and a linear SPM kernel based on SIFT sparse codes is proposed, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
  • S. Lazebnik, C. Schmid, J. Ponce
  • Computer Science
  • 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
TLDR
This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
Fisher Kernels on Visual Vocabularies for Image Categorization
  • F. Perronnin, C. Dance
  • Mathematics, Computer Science
  • 2007 IEEE Conference on Computer Vision and Pattern Recognition
  • 2007
TLDR
This work shows that Fisher kernels can actually be understood as an extension of the popular bag-of-visterms, and proposes to apply this framework to image categorization where the input signals are images and where the underlying generative model is a visual vocabulary: a Gaussian mixture model which approximates the distribution of low-level features in images.
Building Compact Local Pairwise Codebook with Joint Feature Space Clustering
TLDR
Experimental results on challenging datasets demonstrate that LPC outperforms the baselines and performs competitively against the state-of-the-art techniques in scene and object categorization tasks where a large number of categories need to be recognized.
Learning Directional Local Pairwise Bases with Sparse Coding
TLDR
Directional Local Pairwise Bases (DLPB) is proposed, which applies sparse coding to learn a compact set of bases capturing the correlation between these descriptors, so as to avoid the combinatorial explosion.
Representing shape with a spatial pyramid kernel
TLDR
This work introduces a descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel that is designed so that the shape correspondence between two images can be measured by the distance between their descriptors using the kernel.
Improving the Fisher Kernel for Large-Scale Image Classification
TLDR
In an evaluation involving hundreds of thousands of training images, it is shown that classifiers learned on Flickr groups perform surprisingly well and that they can complement classifiers learned on more carefully annotated datasets.
Spatial-bag-of-features
TLDR
The proposed retrieval framework works well in image retrieval tasks owing to the encoding of geometric information of objects for capturing objects' spatial transformation, the supervised feature selection and combination strategy for enhancing the discriminative power, and the bag-of-features representation for effective image matching and indexing in large-scale image retrieval.