Efficient Multi-cue Scene Segmentation

@inproceedings{Scharwchter2013EfficientMS,
  title={Efficient Multi-cue Scene Segmentation},
  author={Timo Scharw{\"a}chter and Markus Enzweiler and Uwe Franke and Stefan Roth},
  booktitle={GCPR},
  year={2013}
}
This paper presents a novel multi-cue framework for scene segmentation, involving a combination of appearance (grayscale images) and depth cues (dense stereo vision). An efficient 3D environment model is utilized to create a small set of meaningful free-form region hypotheses for object location and extent. Those regions are subsequently categorized into several object classes using an extended multi-cue bag-of-features pipeline. For that, we augment grayscale bag-of-features by bag-of-depth… 
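The multi-cue bag-of-features idea described in the abstract can be illustrated with a minimal sketch: local descriptors from each region are quantized against a visual codebook, and the resulting appearance and depth histograms are concatenated into one region-level feature vector. All names, codebook sizes, and the random data below are hypothetical placeholders, not the paper's actual pipeline.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Quantize local descriptors to their nearest codebook word and
    return an L1-normalized bag-of-features histogram."""
    # pairwise squared distances between descriptors (n, d) and words (k, d)
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

def multi_cue_histogram(gray_desc, depth_desc, gray_codebook, depth_codebook):
    """Concatenate appearance (grayscale) and depth bag-of-features
    histograms into a single region-level feature vector."""
    return np.concatenate([
        bof_histogram(gray_desc, gray_codebook),
        bof_histogram(depth_desc, depth_codebook),
    ])

rng = np.random.default_rng(0)
gray_cb = rng.normal(size=(32, 8))   # hypothetical 32-word appearance codebook
depth_cb = rng.normal(size=(16, 4))  # hypothetical 16-word depth codebook
feat = multi_cue_histogram(rng.normal(size=(100, 8)),
                           rng.normal(size=(80, 4)), gray_cb, depth_cb)
print(feat.shape)  # → (48,)
```

The concatenated vector would then be fed to a per-region classifier; the codebooks themselves would normally come from clustering training descriptors (e.g. k-means), which is omitted here for brevity.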
Object-Level Priors for Stixel Generation
TLDR
This paper presents a principled way to additionally integrate top-down prior information about object location and shape, arising from independent system modules ranging from geometric cues to highly confident object detections, into a consistent scene representation for traffic scenarios.
PedCut: an iterative framework for pedestrian segmentation combining shape models and multiple data cues
TLDR
An iterative, EM-like framework for accurate pedestrian segmentation, combining generative shape models and multiple data cues, is presented; results suggest that this method outperforms the state of the art.
Stixel based scene understanding for autonomous vehicles
TLDR
A stereo vision based obstacle detection and scene segmentation algorithm suitable for autonomous vehicles is proposed, based on an innovative extension of the Stixel world that avoids computing a disparity map.
Layered Interpretation of Street View Images
TLDR
This work proposes a 4-layer street view model, a compact representation compared with the recently proposed Stixmantics model, that outperforms other competing approaches on the Daimler urban scene segmentation dataset.
Geodesic pixel neighborhoods for 2D and 3D scene understanding
Stixmantics: A Medium-Level Model for Real-Time Semantic Scene Understanding
TLDR
Stixmantics is a novel medium-level scene representation for real-time visual semantic scene understanding, in which relevant scene structure, motion and object class information is encoded using so-called Stixels as primitive elements.
Semantic Urban Maps
TLDR
A novel region-based 3D semantic mapping method for urban scenes that simultaneously assigns the regions of segmented images to a set of geometric and semantic classes by employing a Markov Random Field based classification framework.
Simultaneous Transparent and Non-Transparent Object Segmentation With Multispectral Scenes
TLDR
This research proposes a new semantic segmentation method with a three-stream structure that exploits differences in transmission characteristics; a new visible-and-infrared coaxial dataset called "coaxials" was constructed, and the method is demonstrated to obtain better segmentation performance than the conventional method.
Ground segmentation and occupancy grid generation using probability fields
TLDR
A novel technique for segmenting the ground plane and at the same time estimating the occupancy probability of each point in a scene is proposed, which requires minimal initialization and is independent of the stereo sensor characteristics as well as the parameters of the disparity algorithm.
The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes
TLDR
The Mapillary Vistas Dataset is a novel, large-scale street-level image dataset containing 25,000 high-resolution images annotated into 66 object categories with additional, instance-specific labels for 37 classes, aiming to significantly further the development of state-of-the-art methods for visual road-scene understanding.

References

Showing 1-10 of 35 references
Semantic segmentation of street scenes by superpixel co-occurrence and 3D geometry
  • B. Micusík, J. Kosecka
  • Computer Science
    2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops
  • 2009
TLDR
The main novelty of this generative approach is the introduction of an explicit model of spatial co-occurrence of visual words associated with super-pixels and utilization of appearance, geometry and contextual cues in a probabilistic framework.
Combining Appearance and Structure from Motion Features for Road Scene Understanding
TLDR
A framework for pixel-wise object segmentation of road scenes that combines motion and appearance features that is designed to handle street-level imagery such as that on Google Street View and Microsoft Bing Maps is presented.
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
  • S. Lazebnik, C. Schmid, J. Ponce
  • Computer Science
    2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
TLDR
This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
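The spatial pyramid matching technique summarized above partitions the image into increasingly fine grids and concatenates a weighted word histogram per cell. A minimal sketch, assuming quantized visual words and keypoint positions normalized to [0, 1), with the level weighting from the Lazebnik et al. kernel:

```python
import numpy as np

def spatial_pyramid(words, positions, n_words, levels=2):
    """Build a spatial-pyramid feature: at level l the image is split
    into a 2^l x 2^l grid and a visual-word histogram is computed per
    cell, weighted so that finer levels contribute more."""
    feats = []
    for l in range(levels + 1):
        cells = 2 ** l
        # grid cell index of each keypoint (positions normalized to [0, 1))
        cx = np.minimum((positions[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((positions[:, 1] * cells).astype(int), cells - 1)
        # pyramid-match weights: 1/2^L at level 0, 1/2^(L-l+1) at level l > 0
        weight = 1.0 / 2 ** levels if l == 0 else 1.0 / 2 ** (levels - l + 1)
        for i in range(cells):
            for j in range(cells):
                mask = (cx == i) & (cy == j)
                h = np.bincount(words[mask], minlength=n_words).astype(float)
                feats.append(weight * h)
    return np.concatenate(feats)

rng = np.random.default_rng(1)
words = rng.integers(0, 10, size=50)      # 50 keypoints, 10-word vocabulary
positions = rng.random((50, 2))           # normalized (x, y) coordinates
feat = spatial_pyramid(words, positions, n_words=10)
print(feat.shape)  # → (210,)  i.e. 10 words * (1 + 4 + 16) cells
```

The resulting vector is typically compared with a histogram-intersection kernel; the sketch only shows the feature construction, not the kernel or classifier.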
Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
TLDR
This work proposes algorithms for object boundary detection and hierarchical segmentation that generalize the gPb-ucm approach of [2] by making effective use of depth information and shows how this contextual information in turn improves object recognition.
A textured object recognition pipeline for color and depth image data
We present an object recognition system which leverages the additional sensing and calibration information available in a robotics setting together with large amounts of training data to build high
Semantic segmentation using regions and parts
TLDR
A novel design for region-based object detectors that efficiently integrates top-down information from scanning-window part models with global appearance cues is proposed; it produces class-specific scores for bottom-up regions and then aggregates the votes of multiple overlapping candidates through pixel classification.
Semantic Segmentation of Urban Scenes Using Dense Depth Maps
TLDR
The results show that, using only dense depth information, this framework for semantic scene parsing and object recognition based on dense depth maps achieves more accurate segmentation and recognition overall than approaches based on sparse 3D features or appearance, advancing state-of-the-art performance.
Towards a Global Optimal Multi-Layer Stixel Representation of Dense 3D Data
TLDR
This work presents a novel reconstruction of stereo vision data that allows incorporating real-world constraints such as perspective ordering, and delivers an optimal segmentation with respect to free-space and obstacle information.
Segmentation-based multi-class semantic object detection
TLDR
This work uses image segments as primitives to extract robust features and train detection models for a predefined set of categories, and proposes two methods for enhancing segment classification based on fusing the classification results obtained with the different segmentations.
Segmentation and Recognition Using Structure from Motion Point Clouds
TLDR
This work proposes an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion that works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors.