Corpus ID: 237091228

Is Pseudo-Lidar needed for Monocular 3D Object detection?

Dennis Park, Rares Ambrus, Vitor Campanholo Guizilini, Jie Li, Adrien Gaidon
Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D pointclouds, turning cameras into pseudo-lidar sensors. These two-stage detectors improve with the accuracy of the intermediate depth estimation network, which can itself be improved without manual labels via large-scale self-supervised learning. However, they tend to suffer from overfitting more than end-to-end methods, are more complex, and the gap with similar lidar-based…
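
The pseudo-lidar step mentioned in the abstract, turning a predicted depth map into a 3D point cloud, can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with no lens distortion; the function name `depth_to_pseudo_lidar` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative choices, not taken from the paper, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole model: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):      # v: pixel row index
        for u, z in enumerate(row):      # u: pixel column index
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map in meters; the zero entry stands for a missing prediction.
depth = [[10.0, 10.0], [0.0, 20.0]]
cloud = depth_to_pseudo_lidar(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In a two-stage detector of the kind the abstract describes, a point cloud like `cloud` would then be passed to a lidar-style 3D detector, which is why errors in the depth network carry over to the detection stage.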
1 Citation

DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
  • Yue Wang, V. Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao, Justin Solomon
  • Computer Science
  • 2021
This top-down approach outperforms its bottom-up counterpart in which object bounding box prediction follows per-pixel depth estimation, since it does not suffer from the compounding error introduced by a depth prediction model.


End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
A new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. It is compatible with most state-of-the-art networks for both tasks and, in combination with PointRCNN, improves over PL consistently across all benchmarks.
Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud
  • Xinshuo Weng, Kris Kitani
  • Computer Science
  • 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
  • 2019
This work aims to bridge the performance gap between 3D sensing and 2D sensing for 3D object detection by enhancing LiDAR-based algorithms to work with single-image input.
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
This paper proposes to convert image-based depth maps to pseudo-LiDAR representations, essentially mimicking the LiDAR signal, and achieves impressive improvements over the existing state-of-the-art in image-based performance.
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
  • Mingyu Ding, Yuqi Huo, +4 authors P. Luo
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2020
D4LCN overcomes the limitation of conventional 2D convolutions and narrows the gap between image representation and 3D point cloud representation, where the filters and their receptive fields can be automatically learned from image-based depth maps.
Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
This paper provides substantial advances to the pseudo-LiDAR framework through improvements in stereo depth estimation, and proposes a depth-propagation algorithm, guided by the initial depth estimates, to diffuse a few exact measurements (from a sparse LiDAR sensor) across the entire depth map.
Monocular 3D Object Detection via Geometric Reasoning on Keypoints
This paper proposes a novel keypoint-based approach for 3D object detection and localization from a single RGB image, building a multi-branch model around 2D keypoint detection in images and complementing it with a conceptually simple geometric reasoning method.
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection
  • G. Brazil, Xiaoming Liu
  • Computer Science
  • 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2019
M3D-RPN is able to significantly improve the performance of both monocular 3D Object Detection and Bird's Eye View tasks within the KITTI urban autonomous driving dataset, while efficiently using a shared multi-class model.
3D Packing for Self-Supervised Monocular Depth Estimation
This work proposes a novel self-supervised monocular depth estimation method combining geometry with a new deep network, PackNet, learned only from unlabeled monocular videos, which outperforms other self-, semi-, and fully supervised methods on the KITTI benchmark.
Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction
MonoPSR, a monocular 3D object detection method that leverages proposals and shape reconstruction, is presented, and a novel projection alignment loss is devised to jointly optimize these tasks in the neural network to improve 3D localization accuracy.
Frustum PointNets for 3D Object Detection from RGB-D Data
This work directly operates on raw point clouds by popping up RGB-D scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.