Multi-view 3D Object Detection Network for Autonomous Driving
@article{Chen2017Multiview3O,
  title={Multi-view 3D Object Detection Network for Autonomous Driving},
  author={Xiaozhi Chen and Huimin Ma and Ji Wan and Bo Li and Tian Xia},
  journal={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={6526--6534}
}
This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. […] The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network efficiently generates 3D candidate boxes from the bird's eye view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on…
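The bird's eye view representation mentioned above can be illustrated with a minimal sketch. It rasterizes a point cloud into a top-down grid and keeps, per cell, the maximum point height and the point count. The function name `point_cloud_to_bev`, the x/y ranges, the cell size, and the choice of these two channels are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np


def point_cloud_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.5):
    """Rasterize an (N, 3) array of x, y, z points into top-down maps.

    Hypothetical helper: ranges and cell size are assumed values.
    Returns (height, density), each of shape (rows, cols).
    """
    points = np.asarray(points, dtype=np.float64)

    # Keep only points inside the assumed region of interest.
    keep = (
        (points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1])
        & (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1])
    )
    pts = points[keep]

    rows = int(round((x_range[1] - x_range[0]) / cell))
    cols = int(round((y_range[1] - y_range[0]) / cell))

    # Map metric coordinates to integer grid indices.
    r = np.minimum(((pts[:, 0] - x_range[0]) / cell).astype(np.int64), rows - 1)
    c = np.minimum(((pts[:, 1] - y_range[0]) / cell).astype(np.int64), cols - 1)

    # Density channel: number of points falling into each cell.
    density = np.zeros((rows, cols))
    np.add.at(density, (r, c), 1.0)

    # Height channel: maximum z per cell, 0 where the cell is empty.
    height = np.full((rows, cols), -np.inf)
    np.maximum.at(height, (r, c), pts[:, 2])
    height[density == 0] = 0.0

    return height, density


# Toy input: two points share a cell, one is alone, one is out of range.
cloud = np.array([
    [1.2, 0.3, 0.5],
    [1.3, 0.4, 1.5],
    [10.0, -5.0, 0.2],
    [100.0, 0.0, 0.0],
])
height, density = point_cloud_to_bev(cloud)
```

A 2D grid like this can be fed to an ordinary 2D convolutional network, which is one plausible reason a top-down view is convenient for proposal generation.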
1,508 Citations
Multi-View Adaptive Fusion Network for 3D Object Detection
- Computer Science
- ArXiv
- 2020
An attentive pointwise fusion (APF) module is proposed that estimates the importance of the three sources with attention mechanisms, achieving adaptive fusion of multi-view features in a pointwise manner, and an end-to-end learnable network named MVAF-Net is designed to integrate these two components.
Joint 3D Proposal Generation and Object Detection from View Aggregation
- Computer Science
- 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2018
This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
Multi-view semantic learning network for point cloud based 3D object detection
- Computer Science
- Neurocomputing
- 2020
Integrating State-of-the-Art CNNs for Multi-Sensor 3D Vehicle Detection in Real Autonomous Driving Environments
- Computer Science
- 2019 IEEE Intelligent Transportation Systems Conference (ITSC)
- 2019
Two new approaches to detect surrounding vehicles in 3D urban driving scenes and their corresponding Bird’s Eye View (BEV) are presented, removing the need to train on a specific dataset and showing a good capability of generalization to any domain, a key point for self-driving systems.
RoIFusion: 3D Object Detection From LiDAR and Vision
- Computer Science
- IEEE Access
- 2021
A deep neural network architecture is proposed to efficiently fuse multi-modality features for 3D object detection, leveraging the advantages of LIDAR and camera sensors by aggregating a small set of 3D Regions of Interest (RoIs) in the point clouds with the corresponding 2D RoIs in the images.
Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This paper proposes a monocular 3D object detection framework for autonomous driving, together with a multi-modal feature fusion module that embeds the complementary RGB cue into the generated point cloud representation.
Multi-level Fusion Network for 3D Object Detection from Camera and LiDAR Data
- Computer Science
- 2020
A two-stage 3D object detection system that takes camera and LiDAR data as input and outputs the localization and category of the 3D bounding box, using a novel feature extractor to learn full-resolution scale features while maintaining computation speed, coupled with a multimodal fusion Region Proposal Network (RPN) architecture.
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
M3D-RPN is able to significantly improve the performance of both monocular 3D Object Detection and Bird's Eye View tasks within the KITTI urban autonomous driving dataset, while efficiently using a shared multi-class model.
Cross-Modality 3D Object Detection
- Computer Science
- 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2021
This paper presents a novel two-stage multi-modal fusion network for 3D object detection, taking both binocular images and raw point clouds as input, and proposes to use pseudo-LiDAR points from stereo matching as a data augmentation method to densify the LiDAR points.
Improving Deep Multi-modal 3D Object Detection for Autonomous Driving
- Computer Science
- 2021 7th International Conference on Automation, Robotics and Applications (ICARA)
- 2021
This paper aims at obtaining highly accurate 3D localization and recognition of objects in the road scene and tries to improve the performance of the basic architecture, AVOD-FPN, one of the best among sensor fusion-based methods for 3D object detection.
References
SHOWING 1-10 OF 43 REFERENCES
3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2018
This paper employs a convolutional neural net that exploits context and depth information to jointly regress to 3D bounding box coordinates and object pose and outperforms all existing results in object detection and orientation estimation tasks for all three KITTI object classes.
Monocular 3D Object Detection for Autonomous Driving
- Computer Science
- 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This work proposes an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.
Vehicle Detection from 3D Lidar Using Fully Convolutional Network
- Computer Science
- Robotics: Science and Systems
- 2016
This paper proposes to present the data in a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously, and shows the state-of-the-art performance of the proposed method.
3D Object Proposals for Accurate Object Class Detection
- Computer Science
- NIPS
- 2015
This method exploits stereo imagery to place proposals in the form of 3D bounding boxes in the context of autonomous driving and outperforms all existing results on all three KITTI object classes.
Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks
- Computer Science
- 2017 IEEE International Conference on Robotics and Automation (ICRA)
- 2017
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs) by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input.
Sliding Shapes for 3D Object Detection in Depth Images
- Computer Science
- ECCV
- 2014
This paper proposes to use depth maps for object detection and designs a 3D detector to overcome the major difficulties for recognition, namely variations in texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion and sensor noise.
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
- Computer Science
- 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This work proposes the first 3D Region Proposal Network (RPN) to learn objectness from geometric shapes and the first joint Object Recognition Network (ORN) to extract geometric features in 3D and color features in 2D.
Joint SFM and detection cues for monocular 3D localization in road scenes
- Computer Science
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
This work presents a system for fast and highly accurate 3D localization of objects like cars in autonomous driving applications, using a single camera, and makes novel use of raw detection scores to allow the authors' 3D bounding boxes to adapt to better quality 3D cues.
Data-driven 3D Voxel Patterns for object category recognition
- Computer Science
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
A novel object representation is proposed, 3D Voxel Pattern (3DVP), that jointly encodes the key properties of objects including appearance, 3D shape, viewpoint, occlusion and truncation.
FusionNet: 3D Object Classification Using Multiple Data Representations
- Computer Science, Environmental Science
- ArXiv
- 2016
New Volumetric CNN (V-CNN) architectures are introduced and exploited to learn new features, which yield a significantly better classifier than using either of the representations in isolation.