Corpus ID: 236447690

DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization

  • Zhaoyu Su, Pin Siang Tan, Yu-Hsing Wang
  • Published 2021
  • Computer Science
  • ArXiv
In this work, we propose a novel two-stage framework for efficient 3D point cloud object detection. Instead of transforming point clouds into 2D bird's-eye-view projections, we parse the raw point cloud data directly in 3D space yet achieve impressive efficiency and accuracy. To achieve this goal, we propose dynamic voxelization, a method that voxelizes points at a local scale on the fly. By doing so, we preserve the point cloud geometry with 3D voxels, and therefore waive the dependence…
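To make the idea of voxelizing points at a local scale more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `local_voxelize`, the cubic neighborhood, and the `radius` and `resolution` parameters are all assumptions chosen for illustration. It bins the points that fall inside a cube around one query center into a small 3D grid and averages their offsets per voxel.

```python
import numpy as np


def local_voxelize(points, center, radius=1.0, resolution=3):
    """Bin points near `center` into a resolution^3 grid (illustrative only).

    Hypothetical helper, not taken from the paper. Returns per-voxel point
    counts and the per-voxel mean offset from `center`.
    """
    points = np.asarray(points, dtype=np.float64)
    center = np.asarray(center, dtype=np.float64)
    offsets = points - center

    # Keep only points inside the cube [-radius, radius)^3 around the center.
    inside = np.all((offsets >= -radius) & (offsets < radius), axis=1)
    local = offsets[inside]

    # Map each local offset to an integer voxel index along x, y, z.
    voxel_size = 2.0 * radius / resolution
    idx = np.floor((local + radius) / voxel_size).astype(np.int64)
    idx = np.clip(idx, 0, resolution - 1)
    i, j, k = idx[:, 0], idx[:, 1], idx[:, 2]

    # Accumulate counts and coordinate sums; np.add.at handles repeated indices.
    counts = np.zeros((resolution,) * 3, dtype=np.int64)
    sums = np.zeros((resolution,) * 3 + (3,), dtype=np.float64)
    np.add.at(counts, (i, j, k), 1)
    np.add.at(sums, (i, j, k), local)

    # Average only over occupied voxels; empty voxels stay at zero.
    means = np.zeros_like(sums)
    occupied = counts > 0
    means[occupied] = sums[occupied] / counts[occupied][:, None]
    return counts, means
```

Calling this once per query center, instead of building a single global grid up front, is one way to read "on the fly" here; how the paper selects centers and aggregates voxel features is not covered by the truncated abstract.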

Deep Hough Voting for 3D Object Detection in Point Clouds
This work proposes VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting that achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size and high efficiency.
STD: Sparse-to-Dense 3D Object Detector for Point Cloud
This work proposes a two-stage 3D object detection framework, named sparse-to-dense 3D Object Detector (STD), and implements a parallel intersection-over-union (IoU) branch to increase awareness of localization accuracy, resulting in further improved performance.
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  • Yin Zhou, Oncel Tuzel
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
VoxelNet is proposed, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network and learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists.
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud
Extensive experiments on the 3D detection benchmark of KITTI dataset show that the proposed architecture outperforms state-of-the-art methods with remarkable margins by using only point cloud as input.
PIXOR: Real-time 3D Object Detection from Point Clouds
PIXOR is proposed, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. It notably surpasses other state-of-the-art methods in terms of Average Precision (AP) while still running at 10 FPS.
Frustum PointNets for 3D Object Detection from RGB-D Data
This work directly operates on raw point clouds by popping up RGBD scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
Structure Aware Single-Stage 3D Object Detection From Point Cloud
An auxiliary network is designed which converts the convolutional features in the backbone network back to point-level representations and an efficient part-sensitive warping operation is developed to align the confidences to the predicted bounding boxes.
3DSSD: Point-Based 3D Single Stage Object Detector
This paper presents 3DSSD, a lightweight point-based 3D single-stage object detector that achieves a decent balance of accuracy and efficiency and outperforms all state-of-the-art voxel-based single-stage methods by a large margin.
Fast Point R-CNN
This work presents a unified, efficient and effective framework for point-cloud based 3D object detection that achieves state-of-the-art performance with a 15 FPS detection rate.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
This paper designs a novel type of neural network that directly consumes point clouds, respects the permutation invariance of points in the input, and provides a unified architecture for applications ranging from object classification and part segmentation to scene semantic parsing.