Corpus ID: 221370703

PV-RCNN: The Top-Performing LiDAR-only Solutions for 3D Detection / 3D Tracking / Domain Adaptation of Waymo Open Dataset Challenges

  • Shaoshuai Shi, Chaoxu Guo, Jihan Yang, Hongsheng Li
In this technical report, we present the top-performing LiDAR-only solutions for the three tracks of the Waymo Open Dataset Challenges 2020: 3D detection, 3D tracking, and domain adaptation. Our solutions for the competition are built upon our recently proposed PV-RCNN 3D object detection framework. Several variants of PV-RCNN are explored, including temporal information incorporation, dynamic voxelization, adaptive training sample selection, and classification with RoI features. A simple model…

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
Range Sparse Net (RSN) is a simple, efficient, and accurate 3D object detector that runs at more than 60 frames per second on a 150 m × 150 m detection region on the Waymo Open Dataset (WOD) while being more accurate than previously published detectors.


End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
This paper aims to synergize the bird's-eye view and the perspective view and proposes a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both and significantly improve detection accuracy over the comparable single-view PointPillars baseline.
Frustum PointNets for 3D Object Detection from RGB-D Data
This work directly operates on raw point clouds by popping up RGB-D scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  • Yin Zhou, Oncel Tuzel
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
VoxelNet is proposed, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network and learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists.
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud
Extensive experiments on the 3D detection benchmark of the KITTI dataset show that the proposed architecture outperforms state-of-the-art methods by remarkable margins while using only point clouds as input.
SECOND: Sparsely Embedded Convolutional Detection
An improved sparse convolution method for voxel-based 3D convolutional networks is investigated, which significantly increases the speed of both training and inference, and a new form of angle loss regression is introduced to improve orientation estimation performance.
From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network
This paper extends the preliminary work PointRCNN to a novel and strong point-cloud-based 3D object detection framework, the part-aware and aggregation neural network, which outperforms all existing 3D detection methods and achieves a new state of the art on the KITTI 3D object detection dataset by utilizing only the LiDAR point cloud data.
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
This work introduces a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set, and proposes novel set learning layers that adaptively combine features from multiple scales to learn deep point set features efficiently and robustly.
Submanifold Sparse Convolutional Networks
This work introduces a sparse convolutional operation tailored to processing sparse data that operates strictly on submanifolds, rather than "dilating" the observation with every layer in the network.
Scalability in Perception for Autonomous Driving: Waymo Open Dataset
This work introduces a new large-scale, high-quality, diverse dataset, consisting of well-synchronized and calibrated high-quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods.
3D Multi-Object Tracking: A Baseline and New Evaluation Metrics
Surprisingly, although the proposed system does not use any 2D data as inputs, it achieves competitive performance on the KITTI 2D MOT leaderboard and runs at a rate of 207.4 FPS, achieving the fastest speed among all modern MOT systems.
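
The Submanifold Sparse Convolutional Networks entry above contrasts an operation that stays on the active sites with one that "dilates" the observation at every layer. The following is a minimal illustrative sketch of that contrast on a 1D grid, not the paper's implementation: it tracks only which sites are active (no feature values), ignores grid boundaries, and the function names are hypothetical.

```python
def regular_sparse_conv_sites(active, kernel_size=3):
    """Active output sites of a regular sparse convolution (stride 1):
    every site whose receptive field touches an active input becomes active."""
    half = kernel_size // 2
    out = set()
    for site in active:
        for offset in range(-half, half + 1):
            out.add(site + offset)
    return out


def submanifold_conv_sites(active, kernel_size=3):
    """Active output sites of a submanifold sparse convolution:
    an output site is active only if the input at that same site is active."""
    return set(active)


def count_after_layers(active, layers, conv):
    """Apply `conv` repeatedly and return the number of active sites."""
    sites = set(active)
    for _ in range(layers):
        sites = conv(sites)
    return len(sites)


# Hypothetical toy input: four occupied cells on a 1D grid.
active = {10, 11, 12, 40}
dilated = count_after_layers(active, 3, regular_sparse_conv_sites)
preserved = count_after_layers(active, 3, submanifold_conv_sites)
print(dilated, preserved)
```

In this toy setting the regular variant grows the active set with every layer, while the submanifold variant keeps it fixed, which is the property the summary attributes to the method.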