Corpus ID: 237532365

Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views

  title={Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views},
  author={Robert McCraith and Eldar Insafutdinov and Luk{\'a}{\v s} Neumann and Andrea Vedaldi},
We present a system for automatically converting 2D object mask predictions and raw LiDAR point clouds into full 3D bounding boxes of objects. Because the LiDAR point clouds are partial, directly fitting bounding boxes to the point clouds is ill-posed. Instead, we suggest that obtaining good results requires sharing information between all objects in the dataset jointly, over multiple frames. We then make three improvements to the baseline. First, we address ambiguities in predicting the…
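The core idea of fitting a box to a partial, outlier-contaminated point cloud can be illustrated with a minimal sketch. The snippet below is not the paper's method (which shares information across objects and views); it is a simple per-object stand-in that discounts LiDAR outliers by trimming the extreme quantiles along each axis before taking the box extents. The function name and the trimming fraction are illustrative assumptions.

```python
import numpy as np

def robust_box_fit(points, trim=0.05):
    """Fit an axis-aligned 3D box to a partial LiDAR point cloud.

    Outliers are discounted by dropping the lowest and highest
    `trim` quantiles per axis, so a handful of stray returns
    cannot inflate the box. Returns (center, size).
    """
    lo = np.quantile(points, trim, axis=0)        # robust per-axis minimum
    hi = np.quantile(points, 1.0 - trim, axis=0)  # robust per-axis maximum
    center = (lo + hi) / 2.0
    size = hi - lo
    return center, size

# Toy example: a unit cube of inlier returns plus a few far-away outliers.
rng = np.random.default_rng(0)
inliers = rng.uniform(-0.5, 0.5, size=(500, 3))
outliers = rng.uniform(20.0, 30.0, size=(5, 3))
cloud = np.vstack([inliers, outliers])

center, size = robust_box_fit(cloud)
# The recovered box stays close to the unit cube despite the outliers,
# whereas a naive min/max fit would stretch it out to ~30 m.
```

A naive `points.max(0) - points.min(0)` fit on the same cloud would report a box roughly 30 m on a side; the trimmed fit recovers extents near the true unit cube, which is the kind of robustness the paper pursues (there, by jointly reasoning over all objects and views rather than by fixed trimming).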



Frustum PointNets for 3D Object Detection from RGB-D Data
This work directly operates on raw point clouds by popping up RGB-D scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall even for small objects.
Weakly Supervised 3D Object Detection from Point Clouds
VS3D, a framework for weakly supervised 3D object detection from point clouds without using any ground truth 3D bounding box for training, is proposed and an unsupervised 3D proposal module that generates object proposals by leveraging normalized point cloud densities is introduced.
3D Bounding Box Estimation Using Deep Learning and Geometry
Although conceptually simple, this method outperforms more complex and computationally expensive approaches that leverage semantic segmentation, instance-level segmentation and flat ground priors, and produces state-of-the-art results for 3D viewpoint estimation on the Pascal 3D+ dataset.
Autolabeling 3D Objects With Differentiable Rendering of SDF Shape Priors
An automatic annotation pipeline to recover 9D cuboids and 3D shapes from pre-trained off-the-shelf 2D detectors and sparse LIDAR data is presented, and a curriculum learning strategy is proposed, iteratively retraining on samples of increasing difficulty in subsequent self-improving annotation rounds.
Weakly Supervised 3D Object Detection from Lidar Point Cloud
This work proposes a weakly supervised approach for 3D object detection, only requiring a small set of weakly annotated scenes, associated with a few precisely labeled object instances, achieved by a two-stage architecture design.
Multi-view 3D Object Detection Network for Autonomous Driving
This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes, and designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.
Teaching 3D geometry to deformable part models
This paper extends the successful discriminatively trained deformable part models to include both estimates of viewpoint and 3D parts that are consistent across viewpoints, and experimentally verifies that adding 3D geometric information comes at minimal performance loss w.r.t. …
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  • Yin Zhou, Oncel Tuzel
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
VoxelNet is proposed, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network and learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists.
Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs), leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input.
Learning Monocular 3D Vehicle Detection Without 3D Bounding Box Labels
This work proposes a network architecture and training procedure for learning monocular 3D object detection without 3D bounding box labels, and evaluates the proposed algorithm on the real-world KITTI dataset, achieving promising performance in comparison to state-of-the-art methods and superior performance to conventional baseline methods.