Multi-view 3D Object Detection Network for Autonomous Driving

@inproceedings{Chen2017Multiview3O,
  title={Multi-view 3D Object Detection Network for Autonomous Driving},
  author={Xiaozhi Chen and Huimin Ma and Ji Wan and Bo Li and Tian Xia},
  booktitle={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={6526-6534}
}
  • Published 23 November 2016
This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. Key Method: the network is composed of two subnetworks, one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's-eye-view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on…
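The deep fusion scheme described above keeps one feature path per view and lets the paths exchange information after each stage. The following is a minimal sketch of that idea, not the authors' implementation: `DeepFusionStage` and its linear transforms are hypothetical stand-ins for the paper's convolutional layers, and element-wise mean is used as the fusion operation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class DeepFusionStage:
    """One stage of element-wise deep fusion across view paths (sketch)."""
    def __init__(self, channels, num_views=3):
        # one linear transform per view path (stand-in for a conv layer)
        self.weights = [rng.standard_normal((channels, channels)) * 0.1
                        for _ in range(num_views)]

    def __call__(self, features):
        # 1) transform each view's region-wise feature independently
        out = [relu(f @ w) for f, w in zip(features, self.weights)]
        # 2) fuse by element-wise mean across the view paths
        fused = np.mean(out, axis=0)
        # 3) feed the joint feature back into every path, so the next
        #    stage's per-path transforms interact through it
        return [fused.copy() for _ in out]

# usage: three hypothetical 64-d region features (BEV, front view, image),
# batch of 2, passed through two fusion stages
views = [rng.standard_normal((2, 64)) for _ in range(3)]
stage1, stage2 = DeepFusionStage(64), DeepFusionStage(64)
fused = stage2(stage1(views))
```

Because each stage transforms the shared fused feature with per-path weights before fusing again, information flows between intermediate layers of different paths, which is the property the abstract highlights over early or late fusion.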


Multi-View Adaptive Fusion Network for 3D Object Detection
TLDR
An attentive pointwise fusion (APF) module is proposed that uses attention mechanisms to estimate the importance of the three feature sources, achieving adaptive fusion of multi-view features in a pointwise manner; an end-to-end learnable network named MVAF-Net is designed to integrate these two components.
Joint 3D Proposal Generation and Object Detection from View Aggregation
TLDR
This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
Integrating State-of-the-Art CNNs for Multi-Sensor 3D Vehicle Detection in Real Autonomous Driving Environments
TLDR
Two new approaches are presented to detect surrounding vehicles in 3D urban driving scenes and in the corresponding Bird's Eye View (BEV); they remove the need to train on a specific dataset and show good generalization to any domain, a key point for self-driving systems.
RoIFusion: 3D Object Detection From LiDAR and Vision
TLDR
A deep neural network architecture is proposed to efficiently fuse the multi-modality features for 3D object detection by leveraging the advantages of LIDAR and camera sensors by aggregating a small set of 3D Region of Interests (RoIs) in the point clouds with the corresponding 2D RoIs in the images.
Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving
TLDR
This paper proposes a monocular 3D object detection framework in the domain of autonomous driving, and proposes a multi-modal feature fusion module to embed the complementary RGB cue into the generated point clouds representation.
Multi-level Fusion Network for 3D Object Detection from Camera and LiDAR Data
TLDR
A two-stage 3D object detection system that takes camera and LiDAR data as input and outputs the localization and category of the 3D bounding box; it uses a novel feature extractor to learn full-resolution scale features while keeping computation fast, coupled with a multimodal fusion Region Proposal Network (RPN) architecture.
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection
TLDR
M3D-RPN is able to significantly improve the performance of both monocular 3D Object Detection and Bird's Eye View tasks within the KITTI urban autonomous driving dataset, while efficiently using a shared multi-class model.
Cross-Modality 3D Object Detection
TLDR
This paper presents a novel two-stage multi-modal fusion network for 3D object detection, taking both binocular images and raw point clouds as input, and proposes to use pseudo-LiDAR points from stereo matching as a data augmentation method to densify the LiDAR points.
Improving Deep Multi-modal 3D Object Detection for Autonomous Driving
  • R. Khamsehashari, K. Schill
  • Computer Science
    2021 7th International Conference on Automation, Robotics and Applications (ICARA)
  • 2021
TLDR
This paper aims at highly accurate 3D localization and recognition of objects in road scenes and improves the performance of the base architecture, AVOD-FPN, one of the best sensor-fusion-based methods for 3D object detection.
...
...

References

Showing 1-10 of 43 references
3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection
TLDR
This paper employs a convolutional neural net that exploits context and depth information to jointly regress to 3D bounding box coordinates and object pose and outperforms all existing results in object detection and orientation estimation tasks for all three KITTI object classes.
Monocular 3D Object Detection for Autonomous Driving
TLDR
This work proposes an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.
Vehicle Detection from 3D Lidar Using Fully Convolutional Network
TLDR
This paper proposes to present the data in a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously, and shows the state-of-the-art performance of the proposed method.
3D Object Proposals for Accurate Object Class Detection
TLDR
This method exploits stereo imagery to place proposals in the form of 3D bounding boxes in the context of autonomous driving and outperforms all existing results on all three KITTI object classes.
Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks
TLDR
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs) by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input.
Sliding Shapes for 3D Object Detection in Depth Images
TLDR
This paper proposes to use depth maps for object detection and designs a 3D detector to overcome the major difficulties for recognition, namely the variations of texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion, and sensor noise.
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
  • S. Song, Jianxiong Xiao
  • Computer Science
    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
TLDR
This work proposes the first 3D Region Proposal Network (RPN) to learn objectness from geometric shapes and the first joint Object Recognition Network (ORN) to extract geometric features in 3D and color features in 2D.
Joint SFM and detection cues for monocular 3D localization in road scenes
TLDR
This work presents a system for fast and highly accurate 3D localization of objects like cars in autonomous driving applications, using a single camera, and makes novel use of raw detection scores to allow the authors' 3D bounding boxes to adapt to better quality 3D cues.
Data-driven 3D Voxel Patterns for object category recognition
TLDR
A novel object representation is proposed, 3D Voxel Pattern (3DVP), that jointly encodes the key properties of objects including appearance, 3D shape, viewpoint, occlusion and truncation.
FusionNet: 3D Object Classification Using Multiple Data Representations
TLDR
New Volumetric CNN (V-CNN) architectures are introduced and exploited to learn new features, which yield a significantly better classifier than using either of the representations in isolation.
...
...