Segmentation and Recognition Using Structure from Motion Point Clouds

  Gabriel J. Brostow, Jamie Shotton, Julien Fauqueur, Roberto Cipolla

We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. Our method works well on sparse, noisy point clouds and, unlike existing approaches, does not need appearance-based descriptors. Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that accurate segmentation and recognition are possible using only motion and 3D world structure.
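The structure-only cues used by such methods are described here only at a high level. As a rough, hypothetical illustration (not the authors' actual feature set), per-point cues such as height above an assumed ground plane and local point density could be computed from a sparse cloud like this:

```python
import numpy as np

def point_cloud_cues(points, camera_height=1.5, radius=1.0):
    """Toy structure cues for a sparse SfM point cloud.

    points: (N, 3) array of world coordinates, with y as the up axis
    (an assumed convention). Returns per-point features: height above
    an assumed ground plane and local point density.
    """
    points = np.asarray(points, dtype=float)
    # Height above a ground plane assumed at y = -camera_height.
    height = points[:, 1] + camera_height
    # Local density: number of neighbours within `radius`
    # (brute force, O(N^2); fine for a sketch).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    density = (d < radius).sum(axis=1) - 1  # exclude the point itself
    return np.stack([height, density], axis=1)

cloud = np.array([[0.0, -1.5, 5.0],   # ground-level point
                  [0.1, -1.4, 5.0],   # nearby ground point
                  [0.0,  2.0, 5.0]])  # elevated point (e.g. building)
feats = point_cloud_cues(cloud)
```

Features like these are cheap, purely geometric, and independent of appearance, which is the property the entry above emphasizes.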

Combining Appearance and Structure from Motion Features for Road Scene Understanding

A framework for pixel-wise object segmentation of road scenes is presented that combines motion and appearance features and is designed to handle street-level imagery such as that on Google Street View and Microsoft Bing Maps.

Integrating Motion and Segmentation for Road Scene Labeling

A new integration framework using an SfM module and a bag of textons method for road scene labeling, together with a pairwise conditional random field (CRF) model, is presented, aiming to improve 3D scene understanding systems used for vehicle environment perception.
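For context, the pairwise CRF referred to here is typically written as a label energy combining unary and pairwise terms; the notation below is the standard formulation, not taken from the paper:

```latex
E(\mathbf{x}) = \sum_{i} \psi_i(x_i) \;+\; \sum_{(i,j) \in \mathcal{N}} \psi_{ij}(x_i, x_j)
```

Here $\psi_i$ scores assigning label $x_i$ to site $i$ from local features (e.g. textons or SfM cues), $\psi_{ij}$ penalizes differing labels at neighbouring sites $(i,j) \in \mathcal{N}$, and labeling corresponds to minimizing $E$ over $\mathbf{x}$.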

Semantic Segmentation of Urban Scenes Using Dense Depth Maps

The results show that, using dense depth information alone, this framework for semantic scene parsing and object recognition based on dense depth maps achieves more accurate segmentation and recognition than approaches based on sparse 3D features or appearance, advancing state-of-the-art performance.

Semantic structure from motion with points, regions, and objects

This paper's framework is capable of accurately estimating the pose and location of objects, regions, and points in the 3D scene; it recognizes objects and regions more accurately than state-of-the-art single-image recognition methods.

Nonparametric semantic segmentation for 3D street scenes

  • Hu He, B. Upcroft
  • Computer Science
    2013 IEEE/RSJ International Conference on Intelligent Robots and Systems
  • 2013
This paper uses stereo image pairs from cameras mounted on a moving car to produce dense depth maps, which are combined into a global 3D reconstruction using camera poses from stereo visual odometry; the resulting 3D semantic model is further improved by accounting for moving objects in the scene.

Road scene segmentation via fusing camera and lidar data

This paper presents an approach for pixel-wise object segmentation of road scenes based on the integration of a color image and an aligned 3D point cloud, incorporating learned prior models together with hard constraints on the registered pixels and pairwise spatial constraints to obtain the final segmentation.

Semantic structure from motion with object and point interactions

We propose a new method for jointly detecting objects and recovering the geometry of the scene (camera pose, object and scene point 3D locations) from multiple semi-calibrated images (camera internal parameters known).

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

Improved 3D structure and temporally consistent semantic segmentation are demonstrated for difficult, large-scale, forward-moving monocular image sequences.

Road scene labeling using SfM module and 3D bag of textons

A new integration framework using an SfM module and a bag of textons method for road scene labeling, together with a pairwise conditional random field model, is presented, aiming to improve 3D scene understanding systems used for vehicle environment perception.

Multiple view semantic segmentation for street view images

This work proposes a simple but powerful multi-view semantic segmentation framework for images captured by a camera mounted on a car driving along streets, together with an approach within the same framework that enables large-scale labeling in both 3D space and 2D images.

3D LayoutCRF for Multi-View Object Class Recognition and Segmentation

An approach is introduced to accurately detect and segment partially occluded objects at various viewpoints and scales, along with a novel framework for combining object-level descriptions with pixel-level appearance, boundary, and occlusion reasoning.

Dynamic 3D Scene Analysis from a Moving Vehicle

A system is presented that integrates fully automatic scene geometry estimation, 2D object detection, 3D localization, trajectory estimation, and tracking for dynamic scene interpretation from a moving vehicle; its performance is demonstrated on challenging real-world data showing car passages through crowded city areas.

3D generic object categorization, localization and pose estimation

This work proposes a novel and robust model to represent and learn generic 3D object categories, together with a framework in which learning requires minimal supervision compared to previous work.

Detecting Carried Objects in Short Video Sequences

A new method is presented for detecting objects such as bags carried by pedestrians in short video sequences; temporal templates are compared against view-specific exemplars generated offline for unencumbered pedestrians, and the MAP solution yields a segmentation of the carried objects.

3D Model based Object Class Detection in An Arbitrary View

A novel object class detection method based on 3D object modeling is presented that establishes spatial connections between multiple 2D training views by mapping them directly onto the surface of a 3D model.

TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation

A new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently, is proposed, which is used for automatic visual recognition and semantic segmentation of photographs.
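TextonBoost builds on texton features, i.e. filter-bank responses quantized against a learned dictionary. As a minimal illustration of texton assignment only (not the paper's texture-layout filters or boosting stage, and with a toy filter bank and pre-learned centres assumed), each pixel's response vector can be mapped to its nearest texton centre:

```python
import numpy as np

def filter_response(img, kernel):
    """Naive 'same'-size 2D filter response (cross-correlation, zero padding)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * kernel).sum()
    return out

def texton_map(image, filters, centers):
    """Assign each pixel to its nearest texton.

    image:   (H, W) grayscale array
    filters: list of (k, k) kernels (toy filter bank)
    centers: (T, F) texton centres in filter-response space, F = len(filters);
             assumed learned beforehand (e.g. by k-means on training images)
    """
    responses = np.stack(
        [filter_response(image, f) for f in filters], axis=-1)  # (H, W, F)
    # Distance from every pixel's response vector to every centre.
    d = np.linalg.norm(responses[..., None, :] - centers[None, None], axis=-1)
    return d.argmin(axis=-1)  # (H, W) texton indices

img = np.zeros((4, 4))
img[:, 2:] = 1.0                 # left half dark, right half bright
filters = [np.ones((1, 1))]      # identity 'filter bank' for the demo
centers = np.array([[0.0],       # texton 0: dark
                    [1.0]])      # texton 1: bright
tm = texton_map(img, filters, centers)
```

Per-pixel texton indices like `tm` are what the boosted classifier and CRF in the paper then operate on.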

Flexible Object Models for Category-Level 3D Object Recognition

This work proposes a novel framework for visual object recognition in which object classes are represented by assemblies of partial surface models obeying loose local geometric constraints; it outperforms state-of-the-art algorithms for object detection and localization.

Detection and tracking of moving objects from a moving platform in presence of strong parallax

A robust parallax filtering scheme is proposed to accumulate the geometric constraint errors within a sliding window and estimate a likelihood map for pixel classification, which is integrated into the tracking framework based on the spatio-temporal joint probability data association filter (JPDAF).

The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects

  • J. Winn, J. Shotton
  • Computer Science
    2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
This paper addresses the problem of detecting and segmenting partially occluded objects of a known category by defining a part labelling which densely covers the object and imposing asymmetric local spatial constraints on these labels to ensure the consistent layout of parts whilst allowing for object deformation.

Geometric context from a single image

This work shows that the coarse geometric properties of a scene can be estimated by learning appearance-based models of geometric classes, even in cluttered natural scenes, and provides a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label.