Discovering objects and their location in images
@article{Sivic2005DiscoveringOA, title={Discovering objects and their location in images}, author={Josef Sivic and Bryan C. Russell and Alexei A. Efros and Andrew Zisserman and William T. Freeman}, journal={Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1}, year={2005}, volume={1}, pages={370-377 Vol. 1} }
We seek to discover the object categories depicted in a set of unlabelled images. We achieve this using a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA). In text analysis, this is used to discover topics in a corpus using the bag-of-words document representation. Here we treat object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics. The model is applied to images by using a…
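As a rough illustration of the model named in the abstract, the following is a minimal sketch of pLSA fitted by EM on a document-by-word count matrix. It assumes each image has already been reduced to counts of visual words; the function name `plsa`, its parameters, and the toy matrix are hypothetical and are not taken from the paper.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM. `counts[d][w]` is how often word w occurs in document d.

    Returns (p_w_z, p_z_d): P(word | topic) and P(topic | document).
    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        return [v / total for v in row]

    # Random initialisation; every row is a probability distribution.
    p_w_z = [normalized([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalized([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Accumulators for the M-step; the tiny constant avoids all-zero rows.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(w | z) * P(z | d).
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                norm = sum(post)
                for z in range(n_topics):
                    resp = n_dw * post[z] / norm
                    new_w_z[z][w] += resp
                    new_z_d[d][z] += resp
        # M-step: renormalise the expected counts.
        p_w_z = [normalized(row) for row in new_w_z]
        p_z_d = [normalized(row) for row in new_z_d]
    return p_w_z, p_z_d


# Toy corpus: four "images" over a four-word visual vocabulary.
toy_counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 3]]
p_w_z, p_z_d = plsa(toy_counts, n_topics=2)
```

Each row of `p_z_d` is the topic mixture of one image, which corresponds to the abstract's description of an image as a mixture of topics.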
1,150 Citations
Discovering objects and their location in images with Latent Dirichlet Allocation
- Computer Science
- 2005
This work describes the Latent Dirichlet Allocation (LDA) model and the Gibbs sampling solution for Bayesian inference, and shows how to form the visual analogue of text documents by vector quantizing SIFT-like regions at points found through two types of interest point detectors.
Unsupervised discovery of visual object class hierarchies
- Computer Science
- 2008 IEEE Conference on Computer Vision and Pattern Recognition
- 2008
This work proposes to group visual objects using a multi-layer hierarchy tree that is based on common visual elements by adapting to the visual domain the generative hierarchical latent Dirichlet allocation (hLDA) model previously used for unsupervised discovery of topic hierarchies in text.
Decomposition, discovery and detection of visual categories using topic models
- Computer Science
- 2008 IEEE Conference on Computer Vision and Pattern Recognition
- 2008
The proposed speed-ups make the system scale to large databases and improve the state-of-the-art in unsupervised learning as well as supervised detection on the challenging PASCAL'06 multi-class detection tasks for several categories.
Semantic Annotation of Satellite Images Using Latent Dirichlet Allocation
- Computer Science
- IEEE Geoscience and Remote Sensing Letters
- 2010
This annotation task combines a step of supervised classification of patches of the large image and the integration of the spatial information between these patches, using the maximum-likelihood method.
A Thousand Words in a Scene
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2007
This paper presents a novel approach for visual scene modeling and classification, investigating the combined use of text modeling methods and local invariant features. Our work attempts to elucidate…
Unsupervised object category discovery via information bottleneck method
- Computer Science
- ACM Multimedia
- 2010
The Information Bottleneck method is a promising technique for discovering the hidden semantics of images, and is superior to the state-of-the-art unsupervised object category discovery methods.
Graph-Based Object Class Discovery from Images with Multiple Objects
- Computer Science
- IDEAL
- 2014
A new unsupervised graph-based object discovery algorithm that treats images with multiple objects by clustering the local features without specifying the number of clusters and acquires object models as frequent subgraph structures defined by a set of co-occurring edges which describe the spatial relation between local features.
Latent Semantics Local Distribution for CRF-based Image Semantic Segmentation
- Computer Science
- BMVC
- 2009
This paper proposes a method that combines a region-based probabilistic graphical model that builds on the recent success of Conditional Random Fields (CRFs) in the problem of semantic segmentation, with a salient-points-based bags-of-words paradigm.
Discovering objects in images and videos
- Computer Science
- 2008
A novel way of scene analysis in images and videos using an appearance model and a motion model that provides appearance and location estimates of the objects of interest and provides a basis for higher level video content analysis tasks.
Unsupervised Image Categorization and Object Localization using Topic Models and Correspondences between Images
- Computer Science
- 2007 IEEE 11th International Conference on Computer Vision
- 2007
A new approach is presented that employs correspondences between images to provide information about object configuration, which enhances the reliability of object localization and categorization, and shows improved categorization and localization performance on real and synthetic data.
References
Discovering object categories in image collections
- Computer Science
- 2005
Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the…
Matching Words and Pictures
- Computer Science
- J. Mach. Learn. Res.
- 2003
A new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text, is presented, and a number of models for the joint distribution of image regions and words are developed, including several which explicitly learn the correspondence between regions and words.
Hidden semantic concept discovery in region based image retrieval
- Computer Science
- CVPR 2004
- 2004
This work addresses content based image retrieval (CBIR), focusing on developing a hidden semantic concept discovery methodology to address effective semantics-intensive image retrieval. In our…
Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
- Computer Science
- ECCV
- 2002
This work shows how to cluster words that individually are difficult to predict into clusters that can be predicted well; for example, the current set of features cannot predict the distinction between train and locomotive, but can predict the underlying concept.
A Bayesian hierarchical model for learning natural scene categories
- Computer Science
- 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
- 2005
This work proposes a novel approach to learn and recognize natural scene categories by representing the image of a scene by a collection of local regions, denoted as codewords obtained by unsupervised learning.
Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories
- Computer Science
- 2004 Conference on Computer Vision and Pattern Recognition Workshop
- 2004
Video Google: a text retrieval approach to object matching in videos
- Computer Science
- Proceedings Ninth IEEE International Conference on Computer Vision
- 2003
We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video. The object is represented by a set of viewpoint…
Object class recognition by unsupervised scale-invariant learning
- Computer Science
- 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.
- 2003
The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
- Computer Science
- International Journal of Computer Vision
- 2004
The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
Unsupervised Learning of Models for Recognition
- Computer Science
- ECCV
- 2000
A method to learn object class models from unlabeled and unsegmented cluttered scenes for the purpose of visual object recognition that achieves very good classification results on human faces and rear views of cars.