PIXOR: Real-time 3D Object Detection from Point Clouds

@inproceedings{yang2018pixor,
  title={PIXOR: Real-time 3D Object Detection from Point Clouds},
  author={Bin Yang and Wenjie Luo and Raquel Urtasun},
  booktitle={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}
We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. We utilize the 3D data more efficiently by representing the scene from the Bird's Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions.
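The BEV encoding the abstract describes can be sketched as rasterizing the point cloud into a 2D grid of occupancy channels plus a reflectance channel, which a 2D convolutional network then consumes like an image. The ranges, 0.1 m resolution, and 35 height slices below roughly follow the paper's setup, but the function and its parameter names are illustrative, not PIXOR's actual implementation:

```python
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                  z_range=(-2.5, 1.0), res=0.1, z_slices=35):
    """Rasterize an (N, 4) lidar array [x, y, z, intensity] into a BEV
    volume of shape (H, W, z_slices + 1): one binary occupancy channel
    per height slice plus one reflectance channel."""
    H = int((y_range[1] - y_range[0]) / res)
    W = int((x_range[1] - x_range[0]) / res)
    bev = np.zeros((H, W, z_slices + 1), dtype=np.float32)
    x, y, z, r = points.T
    # keep only points inside the region of interest
    mask = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z, r = x[mask], y[mask], z[mask], r[mask]
    col = ((x - x_range[0]) / res).astype(int)
    row = ((y - y_range[0]) / res).astype(int)
    zi = ((z - z_range[0]) / (z_range[1] - z_range[0]) * z_slices).astype(int)
    zi = np.clip(zi, 0, z_slices - 1)
    bev[row, col, zi] = 1.0   # occupancy per height slice
    bev[row, col, -1] = r     # reflectance channel
    return bev
```

Because the output is a dense 2D grid with metric spacing, object sizes in BEV are scale-invariant with distance, which is what makes a single-stage, proposal-free dense predictor practical.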


Accurate and Real-Time Object Detection Based on Bird's Eye View on 3D Point Clouds
A single-stage deep neural network is proposed which is specially designed to combine the semantic and position information on point clouds to output a final feature map and achieves higher performance than the state-of-the-art methods on the KITTI BEV object detection benchmark.
Learning to Detect 3D Objects from Point Clouds in Real Time
This work proposes a novel neural network architecture along with the training and optimization details for detecting 3D objects in point cloud data that surpasses the state of the art in this domain both in terms of average precision and speed, running at more than 30 FPS.
Real-time 3D Object Detection on Point Clouds
  • Enes
  • Computer Science
  • 2020
This thesis aims to improve PointPillars network by utilizing the positional encoding and extending the detection area and presents a simple scheme to train 360-degrees model with ground truths provided for only camera Field-of-View (FOV).
RUHSNet: 3D Object Detection Using Lidar Data in Real Time
This work proposes a novel neural network architecture along with the training and optimization details for detecting 3D objects in point cloud data that surpasses the state of the art in this domain both in terms of average precision and speed, running at > 30 FPS.
Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds
This work presents a novel fusion of neural network based state-of-the-art 3D detector and visual semantic segmentation in the context of autonomous driving and introduces Scale-Rotation-Translation score (SRTs), a fast and highly parameterizable evaluation metric for comparison of object detections.
AA3DNet: Attention Augmented Real Time 3D Object Detection
  • Abhinav Sagar
  • Computer Science
    2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
  • 2022
This work proposes a novel neural network architecture along with the training and optimization details for detecting 3D objects using point cloud data, and presents anchor design along with custom loss functions used in this work.
Real-Time 3D Object Detection From Point Cloud Through Foreground Segmentation
This paper proposes LiDAR-based 3D object detection based on foreground segmentation using a fully sparse convolutional network (FS23D), which outperforms the state-of-the-art LiDAR-based methods in speed and precision for both cars and cyclists.
High-level camera-LiDAR fusion for 3D object detection with machine learning
This framework uses a Machine Learning (ML) pipeline on a combination of monocular camera and LiDAR data to detect vehicles in the surrounding 3D space of a moving platform to demonstrate an efficient and accurate inference on a validation set.
3D Object Detection from Point Cloud
This project analyzes VoteNet – the recently proposed end-to-end deep learning network that leverages the Hough voting algorithm to detect 3D objects directly from the raw point cloud data, and achieves state-of-the-art results in 3D object detection tasks on two large datasets with interior 3D scans.
StarNet: Targeted Computation for Object Detection in Point Clouds
This work presents an object detection system called StarNet designed specifically to take advantage of the sparse and 3D nature of point cloud data, and shows how this design leads to competitive or superior performance on the large Waymo Open Dataset and the KITTI detection dataset, as compared to convolutional baselines.


3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection
This paper employs a convolutional neural net that exploits context and depth information to jointly regress to 3D bounding box coordinates and object pose and outperforms all existing results in object detection and orientation estimation tasks for all three KITTI object classes.
Monocular 3D Object Detection for Autonomous Driving
This work proposes an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.
Sliding Shapes for 3D Object Detection in Depth Images
This paper proposes to use depth maps for object detection and design a 3D detector to overcome the major difficulties for recognition, namely the variations of texture, illumination, shape, viewpoint, clutter, occlusion, self-occlusion and sensor noises.
Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs) by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input.
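The feature-centric voting idea in the Vote3Deep summary can be illustrated as follows: rather than sliding a kernel over every grid cell, each occupied cell "votes" its value into the output cells it can influence, so cost scales with the number of occupied cells instead of the grid size. This 2D sketch with hypothetical names shows the equivalence to a dense "same"-padded cross-correlation, not the paper's actual 3D implementation:

```python
import numpy as np

def voting_conv2d(grid, kernel):
    """Feature-centric voting: each non-zero input cell scatters votes,
    weighted by the appropriate kernel entry, into the output cells it
    influences. Equivalent to dense cross-correlation with zero padding,
    but only occupied cells are visited."""
    H, W = grid.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    out = np.zeros_like(grid, dtype=float)
    for r, c in zip(*np.nonzero(grid)):           # occupied cells only
        for dr in range(-ph, ph + 1):
            for dc in range(-pw, pw + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    # weight a dense cross-correlation centered at
                    # (rr, cc) would apply to input cell (r, c)
                    out[rr, cc] += grid[r, c] * kernel[ph - dr, pw - dc]
    return out
```

On sparse lidar grids, where the overwhelming majority of cells are empty, this scatter formulation is what lets 3D CNN detection run at useful frame rates.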
Vehicle Detection from 3D Lidar Using Fully Convolutional Network
This paper proposes to represent the data as a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously, and shows the state-of-the-art performance of the proposed method.
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
  • S. Song, Jianxiong Xiao
  • Computer Science
    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
This work proposes the first 3D Region Proposal Network (RPN) to learn objectness from geometric shapes and the first joint Object Recognition Network (ORN) to extract geometric features in 3D and color features in 2D.
3D fully convolutional network for vehicle detection in point cloud
  • Bo Li
  • Computer Science, Environmental Science
    2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2017
Fully convolutional network based detection techniques are extended to 3D and applied to point cloud data, and the approach is verified on the task of vehicle detection from lidar point clouds for autonomous driving.
You Only Look Once: Unified, Real-Time Object Detection
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
SSD: Single Shot MultiBox Detector
The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
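The default-box discretization the SSD summary describes can be sketched for a single feature map: one box per aspect ratio is placed at every cell center, with width and height derived from a scale and the square root of the aspect ratio. The function and its parameters are a simplified illustration (SSD additionally uses multiple feature maps and an extra scale for the unit aspect ratio):

```python
import itertools
import math

def default_boxes(fmap_size, scale, aspect_ratios):
    """Generate SSD-style default boxes (cx, cy, w, h) in normalized
    [0, 1] image coordinates for one square feature map: one box per
    aspect ratio at every cell center."""
    boxes = []
    for i, j in itertools.product(range(fmap_size), repeat=2):
        cx = (j + 0.5) / fmap_size   # cell centers, not corners
        cy = (i + 0.5) / fmap_size
        for ar in aspect_ratios:
            # sqrt keeps the box area fixed at scale**2 across ratios
            boxes.append((cx, cy, scale * math.sqrt(ar), scale / math.sqrt(ar)))
    return boxes

boxes = default_boxes(fmap_size=4, scale=0.3, aspect_ratios=(1.0, 2.0, 0.5))
```

The network then regresses offsets from each default box rather than absolute coordinates, which is the same dense, anchor-per-location formulation that BEV detectors such as PIXOR adapt to metric grids.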
DenseBox: Unifying Landmark Localization with End to End Object Detection
DenseBox is introduced, a unified end-to-end FCN framework that directly predicts bounding boxes and object class confidences through all locations and scales of an image and shows that when incorporating with landmark localization during multi-task learning, DenseBox further improves object detection accuracy.