Corpus ID: 235755367

VIN: Voxel-based Implicit Network for Joint 3D Object Detection and Segmentation for Lidars

Yuanxin Zhong, Minghan Zhu, Huei Peng
A unified neural network structure is presented for joint 3D object detection and point cloud segmentation. We leverage rich supervision from both detection and segmentation labels rather than using just one of them. In addition, an extension to single-stage object detectors is proposed, based on the implicit function widely used in 3D scene and object understanding. The extension branch takes the final feature map from the object detection module as input, and produces an…

Sensor Fusion for Joint 3D Object Detection and Semantic Segmentation
An extension to LaserNet, an efficient and state-of-the-art LiDAR-based 3D object detector, is presented, and a method for fusing image data with the LiDAR data is proposed and shown to improve the detection performance of the model, especially at long ranges.
Joint 3D Proposal Generation and Object Detection from View Aggregation
This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
  • Xinge Zhu, Hui Zhou, +5 authors Dahua Lin
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
A new framework for outdoor LiDAR segmentation is proposed, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern while maintaining these inherent properties.
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  • Yin Zhou, Oncel Tuzel
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
VoxelNet is proposed, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network and learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists.
Semantic Segmentation of 3D LiDAR Data in Dynamic Scene Using Semi-Supervised Learning
The qualitative and quantitative experiments show that the combination of a few annotations and a large amount of constraint data significantly enhances the effectiveness and scene adaptability, resulting in greater than 10% improvement.
Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation
This work develops a 3D cylinder partition and a 3D cylinder convolution based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds and introduces a dimension-decomposition based context modeling module to explore the high-rank context information in point clouds in a progressive manner.
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
This paper proposes to convert image-based depth maps to pseudo-LiDAR representations, essentially mimicking the LiDAR signal, and achieves impressive improvements over the existing state-of-the-art in image-based performance.
Monocular 3D Object Detection for Autonomous Driving
This work proposes an energy minimization approach that places object candidates in 3D using the fact that objects should be on the ground-plane, and achieves the best detection performance on the challenging KITTI benchmark, among published monocular competitors.
Frustum PointNets for 3D Object Detection from RGB-D Data
This work directly operates on raw point clouds by popping up RGBD scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
An end-to-end pipeline called SqueezeSeg is presented, based on convolutional neural networks (CNN), which takes a transformed LiDAR point cloud as input and directly outputs a point-wise label map, which is then refined by a conditional random field (CRF) implemented as a recurrent layer.