HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection

@inproceedings{Ye2020HVNetHV,
  title={HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection},
  author={Maosheng Ye and Shuangjie Xu and Tongyi Cao},
  booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={1628--1637}
}
We present Hybrid Voxel Network (HVNet), a novel one-stage unified network for point-cloud-based 3D object detection for autonomous driving. Recent studies show that 2D voxelization with a per-voxel PointNet-style feature extractor leads to an accurate and efficient detector for large 3D scenes. Since the size of the feature map determines the computation and memory cost, the voxel size becomes a parameter that is hard to balance. A smaller voxel size gives better performance, especially…
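The trade-off described in the abstract can be illustrated with a small sketch. It only counts the cells of a 2D (bird's-eye-view) voxel grid and is not taken from the paper: the function names `bev_grid_shape` and `bev_cell_count`, the scene extents, and the voxel sizes are all hypothetical values chosen for illustration.

```python
import math


def bev_grid_shape(x_range, y_range, voxel_size):
    """Return (nx, ny), the number of voxels along x and y in a 2D grid.

    x_range and y_range are (min, max) extents in metres; voxel_size is the
    edge length of a square voxel in metres. All values are illustrative.
    """
    # The small epsilon guards against floating-point noise pushing an exact
    # quotient just above an integer before rounding up.
    nx = math.ceil((x_range[1] - x_range[0]) / voxel_size - 1e-9)
    ny = math.ceil((y_range[1] - y_range[0]) / voxel_size - 1e-9)
    return nx, ny


def bev_cell_count(x_range, y_range, voxel_size):
    """Total number of cells in the 2D feature map for a given voxel size."""
    nx, ny = bev_grid_shape(x_range, y_range, voxel_size)
    return nx * ny


if __name__ == "__main__":
    # Hypothetical scene extents (metres), not the paper's configuration.
    x_range = (0.0, 80.0)
    y_range = (-40.0, 40.0)
    for voxel_size in (0.5, 0.25):
        shape = bev_grid_shape(x_range, y_range, voxel_size)
        cells = bev_cell_count(x_range, y_range, voxel_size)
        print(f"voxel size {voxel_size} m: grid {shape}, {cells} cells")
```

Under these assumptions, halving the voxel edge length quadruples the number of feature-map cells, which is why the voxel size is hard to balance against computation and memory cost.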
DVFENet: Dual-branch voxel feature extraction network for 3D object detection
TLDR
A new 3D object detection framework (DVFENet) based on dual-branch voxel feature extraction is proposed; it provides rich and complete 3D information and uses a decoupled RPN module that obtains task-specific features to reduce task conflict.
HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
TLDR
A novel single-stage 3D detection method with the merits of both voxel-based and point-based features is introduced, and a new convolutional neural network architecture, dubbed HVPR, is proposed that integrates both features into a single 3D representation effectively and efficiently.
Voxel Transformer for 3D Object Detection
  • Jiageng Mao, Yujing Xue, +5 authors Chunjing Xu
  • Computer Science
    ArXiv
  • 2021
TLDR
Voxel Transformer is presented, a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds that shows consistent improvement over the convolutional baselines while maintaining computational efficiency on the KITTI dataset and the Waymo Open dataset.
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection
TLDR
The Point-Voxel Region-based Convolution Neural Networks (PV-RCNNs) are proposed for 3D object detection from point clouds and achieve state-of-the-art 3D detection performance on both the Waymo Open Dataset and the highly-competitive KITTI benchmark.
PVGNet: A Bottom-Up One-Stage 3D Object Detector with Integrated Multi-Level Features
TLDR
The proposed PVGNet performs group voting to get the final detection results; it outperforms previous state-of-the-art methods and ranks at the top of the KITTI 3D/BEV detection leaderboards.
OCM3D: Object-Centric Monocular 3D Object Detection
TLDR
It is argued that the local RoI information from the object image patch alone, with a proper resizing scheme, is a better input, as it provides complete semantic clues while excluding irrelevant interference; the confidence mechanism in monocular 3D object detection is also decomposed by considering the relationship between 3D objects and the associated 2D boxes.
EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection
TLDR
This work proposes an efficient geometry feature generation network (EGFN) that outperforms YOLOStereo3D, the advanced fast method, by 5.16% on mAP3d at the cost of merely an additional 12 ms and hence achieves a better trade-off between accuracy and efficiency for stereo 3D object detection.
Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection
  • Jiageng Mao, Minzhe Niu, Haoyue Bai, Xiaodan Liang, Hang Xu, Chunjing Xu
  • Computer Science
    ArXiv
  • 2021
TLDR
This work presents a flexible and high-performance framework, named Pyramid R-CNN, for two-stage 3D object detection from point clouds, which outperforms the state-of-the-art 3D detection models by a large margin on both the KITTI dataset and the Waymo Open dataset.
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection
  • Qiangeng Xu, Yiqi Zhong, U. Neumann
  • Computer Science
    ArXiv
  • 2021
TLDR
A novel LiDAR-based 3D object detection model, dubbed Behind the Curtain Detector (BtcDet), which learns the object shape priors and estimates the complete object shapes that are partially occluded (curtained) in point clouds.
From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection
  • Jiajun Deng, Wengang Zhou, Yanyong Zhang, Houqiang Li
  • Computer Science
    IEEE Transactions on Circuits and Systems for Video Technology
  • 2021
TLDR
The proposed H23D R-CNN provides a new angle to take full advantage of complementary information in the perspective view and the bird's-eye view with an efficient framework and demonstrates the superiority of the method over the state-of-the-art algorithms with respect to both effectiveness and efficiency.

References

SHOWING 1-10 OF 41 REFERENCES
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  • Yin Zhou, Oncel Tuzel
  • Computer Science
    2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
TLDR
VoxelNet is proposed, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network and learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists.
Voxel-FPN: multi-scale voxel feature aggregation in 3D object detection from point clouds
TLDR
Voxel-FPN is presented, a novel one-stage 3D object detector that uses only raw data from LIDAR sensors; it performs better at extracting features from point data and demonstrates its superiority over some baselines on the challenging KITTI-3D benchmark.
Multi-view 3D Object Detection Network for Autonomous Driving
TLDR
This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes; a deep fusion scheme is designed to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.
Joint 3D Proposal Generation and Object Detection from View Aggregation
TLDR
This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
TLDR
This paper aims to synergize the bird's-eye view and the perspective view and proposes a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both and significantly improves detection accuracy over the comparable single-view PointPillars baseline.
STD: Sparse-to-Dense 3D Object Detector for Point Cloud
TLDR
This work proposes a two-stage 3D object detection framework, named sparse-to-dense 3D Object Detector (STD), and implements a parallel intersection-over-union (IoU) branch to increase awareness of localization accuracy, resulting in further improved performance.
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud
TLDR
Extensive experiments on the 3D detection benchmark of KITTI dataset show that the proposed architecture outperforms state-of-the-art methods with remarkable margins by using only point cloud as input.
Fast Point R-CNN
TLDR
This work presents a unified, efficient and effective framework for point-cloud-based 3D object detection that achieves state-of-the-art results with a 15 FPS detection rate.
PIXOR: Real-time 3D Object Detection from Point Clouds
TLDR
PIXOR is proposed, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions and surpasses other state-of-the-art methods notably in terms of Average Precision (AP), while still running at 10 FPS.
HDNET: Exploiting HD Maps for 3D Object Detection
TLDR
It is shown that High-Definition maps provide strong priors that can boost the performance and robustness of modern 3D object detectors, and a single-stage detector is designed that extracts geometric and semantic features from the HD maps.