Learning hierarchical models of scenes, objects, and parts

@article{Sudderth2005LearningHM,
  title={Learning hierarchical models of scenes, objects, and parts},
  author={Erik B. Sudderth and Antonio Torralba and William T. Freeman and Alan S. Willsky},
  journal={Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1},
  year={2005},
  volume={2},
  pages={1331-1338 Vol. 2}
}
We describe a hierarchical probabilistic model for the detection and recognition of objects in cluttered, natural scenes. The model is based on a set of parts which describe the expected appearance and position, in an object centered coordinate frame, of features detected by a low-level interest operator. Each object category then has its own distribution over these parts, which are shared between objects. We learn the parameters of this model via a Gibbs sampler which uses the graphical model… 
Describing Visual Scenes Using Transformed Objects and Parts
TLDR
This work develops hierarchical, probabilistic models for objects, the parts composing them, and the visual scenes surrounding them and proposes nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene.
Probabilistic modeling of scenes using object frames
TLDR
A probabilistic scene model using object frames, each of which is a group of co-occurring objects with fixed spatial relations that captures the spatial relationship among groups of objects to improve the performance of object recognition and scene retrieval.
Inference and Learning with Hierarchical Shape Models
TLDR
This work automates the decomposition of an object category into parts and contours, and discriminatively learn the cost function that drives the matching of the object to the image using Multiple Instance Learning.
Describing Visual Scenes using Transformed Dirichlet Processes
TLDR
This work develops a hierarchical probabilistic model for the spatial structure of visual scenes based on the transformed Dirichlet process, a novel extension of the hierarchical DP in which a set of stochastically transformed mixture components are shared between multiple groups of data.
Depth from Familiar Objects: A Hierarchical Model for 3D Scenes
TLDR
An integrated, probabilistic model for the appearance and three-dimensional geometry of cluttered scenes and a robust likelihood model accounts for outliers in matched stereo features, allowing effective learning of 3D object structure from partial 2D segmentations.
Multiple Object Class Detection with a Generative Model
TLDR
The performance of the proposed multi-object class detection approach is competitive to state of the art approaches dedicated to a single object class recognition problem.
Learning models of object structure
TLDR
Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held out single view images.
Unsupervised Modeling of Objects and Their Hierarchical Contextual Interactions
TLDR
H hierarchical semantics of objects (hSO) is introduced that captures this hierarchical grouping of objects and is proposed for the unsupervised learning of the hSO from a collection of images of a particular scene.
Unsupervised learning of hierarchical spatial structures in images
TLDR
This work presents a unified approach to unsupervised learning of hierarchical spatial structures from a collection of images, using a hierarchical rule-based model capturing spatial patterns, where each rule is represented by a star-graph.
Unsupervised learning of hierarchical spatial structures in images
TLDR
This work presents a unified approach to unsupervised learning of hierarchical spatial structures from a collection of images, using a hierarchical rule-based model capturing spatial patterns, where each rule is represented by a star-graph.
...
1
2
3
4
5
...

References

SHOWING 1-10 OF 29 REFERENCES
A Bayesian hierarchical model for learning natural scene categories
  • Li Fei-Fei, P. Perona
  • Computer Science
    2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
  • 2005
TLDR
This work proposes a novel approach to learn and recognize natural scene categories by representing the image of a scene by a collection of local regions, denoted as codewords obtained by unsupervised learning.
A Bayesian approach to unsupervised one-shot learning of object categories
TLDR
This work presents a method for learning object categories from just a few images, based on incorporating "generic" knowledge which may be obtained from previously learnt models of unrelated categories, in a variational Bayesian framework.
Object Class Recognition with Many Local Features
  • S. Helmer, D. Lowe
  • Computer Science
    2004 Conference on Computer Vision and Pattern Recognition Workshop
  • 2004
TLDR
A learning method that overcomes the difficulty of matching local features in the image to parts in the model by adding new parts to the model incrementally, using the Maximum-Likelihood framework.
Contextual Models for Object Detection Using Boosted Random Fields
TLDR
This work introduces Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF) and applies it to detect stuff and things in office and street scenes.
Discovering object categories in image collections
Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the
Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories
TLDR
The incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum-likelihood, which have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible.
Image parsing: unifying segmentation, detection, and recognition
TLDR
A general framework for parsing images into regions and objects, which makes use of bottom-up proposals combined with top-down generative models using the data driven Markov chain Monte Carlo algorithm, which is guaranteed to converge to the optimal estimate asymptotically.
Sharing features: efficient boosting procedures for multiclass object detection
  • A. Torralba, K. Murphy, W. Freeman
  • Computer Science
    Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.
  • 2004
TLDR
A multi-class boosting procedure (joint boosting) is presented that reduces both the computational and sample complexity, by finding common features that can be shared across the classes.
Matching Words and Pictures
TLDR
A new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text, is presented, and a number of models for the joint distribution of image regions and words are developed, including several which explicitly learn the correspondence between regions and Words.
Distinctive Image Features from Scale-Invariant Keypoints
  • D. Lowe
  • Computer Science
    International Journal of Computer Vision
  • 2004
TLDR
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
...
1
2
3
...