• Corpus ID: 9789259

Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

@article{Farabet2012ScenePW,
  title={Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers},
  author={Cl{\'e}ment Farabet and Camille Couprie and Laurent Najman and Yann LeCun},
  journal={ArXiv},
  year={2012},
  volume={abs/1202.2160}
}
Scene parsing consists in labeling each pixel in an image with the category of the object it belongs to. We propose a method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. The method alleviates the need for engineered features. In parallel to feature extraction, a tree of segments is computed from a graph of pixel dissimilarities. The feature vectors associated with the segments… 
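The dense multiscale feature extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the random filters stand in for a trained convolutional network, the average-pooling pyramid and nearest-neighbor upsampling are assumptions, and the function names (`downsample`, `filter_same`, `multiscale_features`) are hypothetical. It only shows how one filter bank applied at several scales gives every pixel a feature vector covering regions of multiple sizes.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool a 2-D image by an integer factor (sizes must divide evenly)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def filter_same(img, kernel):
    """Same-size 2-D cross-correlation with zero padding (odd kernel sizes only)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out


def multiscale_features(img, kernels, n_scales=3):
    """Return an (H, W, n_scales * n_kernels) array of per-pixel features.

    The same kernels are applied at every scale (an assumption of this sketch).
    Coarser scales see a larger image region around each pixel.
    """
    h, w = img.shape
    coarsest = 2 ** (n_scales - 1)
    assert h % coarsest == 0 and w % coarsest == 0, "image size must divide evenly"
    maps = []
    for s in range(n_scales):
        factor = 2 ** s
        small = downsample(img, factor)
        for k in kernels:
            response = np.tanh(filter_same(small, k))
            # Nearest-neighbor upsampling back to the input resolution.
            up = np.repeat(np.repeat(response, factor, axis=0), factor, axis=1)
            maps.append(up)
    return np.stack(maps, axis=-1)


rng = np.random.default_rng(0)
img = rng.random((16, 16))                 # toy grayscale image
kernels = rng.standard_normal((4, 3, 3))   # random stand-ins for learned filters
feats = multiscale_features(img, kernels, n_scales=3)
```

With 4 filters and 3 scales, each pixel of the 16x16 toy image gets a 12-dimensional feature vector; in the method described above, such vectors would then be associated with the segments of the tree.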
Learning Hierarchical Features for Scene Labeling
TLDR
A method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel, alleviates the need for engineered features, and produces a powerful representation that captures texture, shape, and contextual information.
Predicting Images using Convolutional Networks: Visual Scene Understanding with Pixel Maps
TLDR
This thesis develops a versatile Multi-Scale Convolutional Network that can be applied to diverse vision problems using simple adaptations, and applies it to predict depth at each pixel, surface normals and semantic labels, and develops a weighted nearest-neighbors labeling system applied to superpixels.
Multi-context Deep Convolutional Features and Exemplar-SVMs for Scene Parsing
TLDR
This paper proposes a scene parsing approach that combines multi-context deep convolutional features with exemplar-SVMs, achieves good performance, and is applied to two challenging datasets.
Semantic Road Segmentation via Multi-scale Ensembles of Learned Features
TLDR
An algorithm based on convolutional neural networks is proposed to learn local features from training data at different scales and resolutions using a weighted linear combination and its performance is similar to state-of-the-art methods using other sources of information such as depth, motion or stereo.
Image parsing with a wide range of classes and scene-level context
  • Marian George
  • Computer Science, Environmental Science
    2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2015
TLDR
A nonparametric scene parsing approach that improves the overall accuracy, as well as the coverage of foreground classes in scene images, and incorporates semantic context in the parsing process through global label costs.
Superparsing - Scalable Nonparametric Image Parsing with Superpixels
TLDR
This paper presents a simple and effective nonparametric approach to the problem of image parsing, or labeling image regions (in this case, superpixels produced by bottom-up segmentation) with their categories, and establishes a new benchmark for the problem.
Enhancing scene parsing by transferring structures via efficient low-rank graph matching
TLDR
A new graph matching model incorporating low-rank and Frobenius regularization is proposed, which not only guarantees an accurate solution, but also provides high optimization efficiency via an eigen-decomposition strategy.
Deep Deconvolutional Networks for Scene Parsing
TLDR
A novel network architecture is proposed that combines deep deconvolutional neural networks with CNNs and is employed for multi-patch training, which has the added advantage of having a training system that can be completely automated end-to-end without requiring any post-processing.
Adaptive Nonparametric Image Parsing
TLDR
This paper proposes a novel adaptive nonparametric approach that determines the sample-specific k for each test image, adaptively set to be the number of the fewest nearest superpixels that the images in the retrieval set can use to get the best category prediction.
Convolutional nets and watershed cuts for real-time semantic Labeling of RGBD videos
TLDR
An efficient video segmentation approach that computes temporally consistent pixels in a causal manner is proposed, filling the need for causal and real-time applications.

References

SHOWING 1-10 OF 31 REFERENCES
Superparsing - Scalable Nonparametric Image Parsing with Superpixels
TLDR
This paper presents a simple and effective nonparametric approach to the problem of image parsing, or labeling image regions (in this case, superpixels produced by bottom-up segmentation) with their categories, and establishes a new benchmark for the problem.
Associative hierarchical CRFs for object class image segmentation
TLDR
This work proposes a hierarchical random field model, that allows integration of features computed at different levels of the quantisation hierarchy, and evaluates its efficiency on some of the most challenging data-sets for object class segmentation, and shows it obtains state-of-the-art results.
Nonparametric scene parsing: Label transfer via dense scene alignment
TLDR
Compared to existing object recognition approaches that require training for each object category, the proposed nonparametric scene parsing system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
Efficiently selecting regions for scene understanding
  • M. P. Kumar, D. Koller
  • Computer Science
    2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • 2010
TLDR
This work designs an efficient approach for obtaining the best set of regions in terms of the energy function itself and demonstrates the usefulness of the algorithm on the task of scene segmentation and shows significant improvements over state of the art methods.
Decomposing a scene into geometric and semantically consistent regions
TLDR
A region-based model which combines appearance and scene geometry to automatically decompose a scene into semantically meaningful regions and which achieves state-of-the-art performance on the tasks of both multi-class image segmentation and geometric reasoning.
Stacked Hierarchical Labeling
TLDR
This work bypasses a global probabilistic model and instead directly trains a hierarchical inference procedure inspired by the message passing mechanics of some approximate inference procedures in graphical models, which mitigates both the theoretical and empirical difficulties of learning probabilistic models when exact inference is intractable.
Deep Convolutional Networks for Scene Parsing
TLDR
This work proposes a deep learning strategy for scene parsing, i.e. to assign a class label to each pixel of an image, which does not need hand-crafted features, allows modeling more complex spatial dependencies and has a lower inference cost.
Maximin affinity learning of image segmentation
TLDR
This work presents the first machine learning algorithm for training a classifier to produce affinity graphs that are good in the sense of producing segmentations that directly minimize the Rand index, a well known segmentation performance measure.
What is the best multi-stage architecture for object recognition?
TLDR
It is shown that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks and that two stages of feature extraction yield better accuracy than one.
Object Recognition by Scene Alignment
TLDR
This work builds a probabilistic model to transfer the labels from the retrieval set to the input image, demonstrates the effectiveness of this approach, and studies the contributions of the algorithm's components using held-out test sets from the LabelMe database.