LiDAR R-CNN: An Efficient and Universal 3D Object Detector

@inproceedings{Li2021LiDARRA,
  title={LiDAR R-CNN: An Efficient and Universal 3D Object Detector},
  author={Zhichao Li and Feng Wang and Naiyan Wang},
  booktitle={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={7542-7551}
}
LiDAR-based 3D detection in point clouds is essential in the perception system of autonomous driving. In this paper, we present LiDAR R-CNN, a second-stage detector that can generally improve any existing 3D detector. To fulfill the real-time and high-precision requirements in practice, we resort to a point-based approach rather than the popular voxel-based approach. However, we find an overlooked issue in previous work: naively applying point-based methods like PointNet could make the learned…
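As a rough illustration of the point-based approach the abstract contrasts with voxelization, a PointNet-style encoder applies a shared per-point MLP and then max-pools over points, so the proposal feature is invariant to point ordering. The layer sizes and function name below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def pointnet_features(points, w1, w2):
    """Toy PointNet-style encoder: shared per-point MLP, then max pooling.
    points: (N, 3) array of x, y, z coordinates inside one proposal.
    w1: (3, H) and w2: (H, F) weight matrices (biases omitted for brevity).
    Returns a single (F,) feature vector describing the proposal."""
    h = np.maximum(points @ w1, 0.0)   # per-point hidden layer (ReLU)
    f = np.maximum(h @ w2, 0.0)        # per-point output features
    return f.max(axis=0)               # order-invariant max pooling

rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 16))          # hypothetical weights for the sketch
w2 = rng.normal(size=(16, 8))
pts = rng.normal(size=(128, 3))        # 128 points in one box proposal
feat = pointnet_features(pts, w1, w2)
print(feat.shape)  # → (8,)
```

Because the pooling is a symmetric function, shuffling the input points leaves the output unchanged, which is the property that makes such encoders a natural fit for unordered LiDAR points.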
Cost-Aware Comparison of LiDAR-based 3D Object Detectors
TLDR
This work focuses on SECOND, a simple grid-based one-stage detector, and analyzes its performance under different costs by scaling its original architecture, and finds that, if allowed to use the same latency, SECOND can match the performance of PV-RCNN++, the current state-of-the-art method on the Waymo Open Dataset.
Cost-Aware Evaluation and Model Scaling for LiDAR-Based 3D Object Detection
TLDR
A cost-aware evaluation of LiDAR-based 3D object detectors and it is found that, if allowed to use the same latency, SECOND can match the performance of PV-RCNN++, the current state-of-the-art method on the Waymo Open Dataset.
Embracing Single Stride 3D Object Detector with Sparse Transformer
TLDR
This paper proposes Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network, and addresses the problem of insufficient receptive field in single-stride architectures.
Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds
TLDR
This paper proposes a highly-efficient single-stage point-based 3D detector, termed IA-SSD, that achieves a superior speed of 80+ frames-per-second on the KITTI dataset with a single RTX2080Ti GPU.
RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
TLDR
This paper proposes an anchor-free single-stage LiDAR-based 3D object detector – RangeDet, and designs three components to address two issues overlooked by previous works: the scale variation between nearby and far away objects and the inconsistency between the 2D range image coordinates used in feature extraction and the 3D Cartesian coordinate used in output.
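The range-view representation that RangeDet defends can be illustrated by the standard spherical projection mapping each LiDAR point to a pixel of a range image. The image size and vertical field-of-view values below are illustrative assumptions, not RangeDet's actual configuration:

```python
import numpy as np

def to_range_image_coords(xyz, width=512, height=64,
                          fov_up=np.radians(3.0), fov_down=np.radians(-25.0)):
    """Project 3D points onto 2D range-image pixel coordinates.
    xyz: (N, 3) points; returns (col, row, range) arrays.
    fov_up/fov_down are an assumed sensor vertical field of view."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    r = np.linalg.norm(xyz, axis=1)          # range (distance to sensor)
    azimuth = np.arctan2(y, x)               # horizontal angle in [-pi, pi]
    inclination = np.arcsin(z / r)           # vertical angle
    col = ((azimuth / np.pi + 1.0) / 2.0) * width             # [0, width)
    row = (1.0 - (inclination - fov_down) / (fov_up - fov_down)) * height
    return col, row, r

# A point 10 m straight ahead lands in the horizontal center column.
col, row, r = to_range_image_coords(np.array([[10.0, 0.0, 0.0]]))
```

The scale-variation issue the summary mentions follows directly from this projection: a nearby object and a distant one of the same physical size occupy very different numbers of range-image pixels.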
Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection
TLDR
This work collects a series of real-world cases with noisy data distribution, and systematically formulate a robustness benchmark toolkit, that simulates these cases on any clean autonomous driving datasets, and holistically benchmark the state-of-the-art fusion methods for the first time.
BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework
TLDR
This work proposes a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the input of LiDAR data, thus addressing the downside of previous methods; it is the first to handle realistic LiDAR malfunction and can be deployed to realistic scenarios without any post-processing procedure.
TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers
TLDR
The proposed TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions, achieves state-of-the-art performance on large-scale datasets and is extended to the 3D tracking task.
Multi-Modality Task Cascade for 3D Object Detection
TLDR
A novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes, and shows that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance.
OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data
TLDR
This paper proposes a method to generate attribution maps for the detected objects in order to better understand the behavior of black-box models, and shows a detailed evaluation of the attribution maps, which are interpretable and highly informative.

References

StarNet: Targeted Computation for Object Detection in Point Clouds
TLDR
This work presents an object detection system called StarNet designed specifically to take advantage of the sparse and 3D nature of point cloud data, and shows how this design leads to competitive or superior performance on the large Waymo Open Dataset and the KITTI detection dataset, as compared to convolutional baselines.
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
TLDR
This paper proposes to convert image-based depth maps to pseudo-LiDAR representations, essentially mimicking the LiDAR signal, and achieves impressive improvements over the existing state-of-the-art in image-based performance.
Frustum PointNets for 3D Object Detection from RGB-D Data
TLDR
This work directly operates on raw point clouds by popping up RGB-D scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
PIXOR: Real-time 3D Object Detection from Point Clouds
TLDR
PIXOR is proposed, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions, surpassing other state-of-the-art methods notably in terms of Average Precision (AP) while still running at 10 FPS.
RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
TLDR
This paper proposes an anchor-free single-stage LiDAR-based 3D object detector – RangeDet, and designs three components to address two issues overlooked by previous works: the scale variation between nearby and far away objects and the inconsistency between the 2D range image coordinates used in feature extraction and the 3D Cartesian coordinate used in output.
LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving
TLDR
Benchmark results show that this approach has significantly lower runtime than other recent detectors and that it achieves state-of-the-art performance when compared on a large dataset that has enough data to overcome the challenges of training on the range view.
CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
  Su Pang, D. Morris, H. Radha. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.
TLDR
A novel Camera-LiDAR Object Candidates (CLOCs) fusion network that provides a low-complexity multi-modal fusion framework that significantly improves the performance of single-modality detectors.
Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation
TLDR
This work develops a 3D cylinder partition and a 3D cylinder convolution based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds, and introduces a dimension-decomposition based context modeling module to explore the high-rank context information in point clouds in a progressive manner.
ImVoteNet: Boosting 3D Object Detection in Point Clouds With Image Votes
TLDR
This work builds on top of VoteNet and proposes a 3D detection architecture called ImVoteNet specialized for RGB-D scenes, based on fusing 2D votes in images and 3D votes in point clouds, advancing state-of-the-art results by 5.7 mAP.
Multi-view 3D Object Detection Network for Autonomous Driving
TLDR
This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes and designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.