MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding

Kunyu Peng, Juncong Fei, Kailun Yang, Alina Roitberg, Jiaming Zhang, Frank Bieder, Philipp Heidenreich, Christoph Stiller, Rainer Stiefelhagen
At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has seen remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg. While most previous works focus on sparse segmentation of the LiDAR input, dense output masks provide self-driving cars with almost complete environment information. In this paper, we introduce MASS, a Multi-Attentional Semantic…
Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation
This work introduces P2PDA, a generic framework for Pinhole $\rightarrow$ Panoramic semantic segmentation that addresses the challenge of domain divergence with different variants of attention-augmented domain adaptation modules, enabling transfer in the output, feature, and feature-confidence spaces.
Towards Robust Semantic Segmentation of Accident Scenes via Multi-Source Mixed Sampling and Meta-Learning
A Multi-source Meta-learning Unsupervised Domain Adaptation (MMUDA) framework is proposed to improve the generalization of segmentation transformers to extreme accident scenes; it achieves an mIoU score of 46 .
HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor
HIDA is proposed: a lightweight assistive system based on 3D point cloud instance segmentation with a solid-state LiDAR sensor for holistic indoor detection and avoidance, which interacts with users intuitively through acoustic feedback.
Trans4Map: Revisiting Holistic Top-down Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers
Trans4Map, an end-to-end one-stage Transformer-based framework for Mapping, achieves state-of-the-art results, reducing parameters by 67% yet gaining a +3 .
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning
This work introduces panoramic panoptic segmentation as the most holistic form of scene understanding, in terms of both Field of View (FoV) and image-level understanding for standard camera-based input, and proposes a framework that allows model training on standard pinhole images and transfers the learned features to a different domain in a cost-minimizing way.
TRAVEL: Traversable Ground and Above-Ground Object Segmentation Using Graph Representation of 3D LiDAR Scans
TRAVEL is proposed, which performs traversable ground detection and object clustering simultaneously using a graph representation of a 3D point cloud. It outperforms other state-of-the-art methods on conventional metrics, and the newly proposed evaluation metrics are shown to be meaningful for assessing above-ground segmentation.
Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates
This work examines how well the confidence values of modern driver observation models actually match the probability of the correct outcome and shows that raw neural network-based approaches tend to significantly overestimate their prediction quality.


SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences
A large dataset to propel research on laser-based semantic segmentation, which opens the door for the development of more advanced methods, but also provides plentiful data to investigate new research directions.
(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
(AF)2-S3Net is proposed: an end-to-end encoder-decoder CNN for 3D LiDAR semantic segmentation with a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder.
Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental Study
This work performs a comprehensive experimental study of image-based semantic segmentation architectures for LiDAR point clouds and proposes an improved point cloud projection technique that does not suffer from systematic occlusions and a new kind of convolution layer with a reduced amount of weight-sharing along one of the two spatial dimensions.
Predicting Semantic Map Representations From Images Using Pyramid Occupancy Networks
Thomas Roddick, R. Cipolla. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
This work presents a simple, unified approach for estimating bird's-eye-view maps of the environment directly from monocular images using a single end-to-end deep learning architecture.
PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
This work proposes a new LiDAR-specific, KNN-free segmentation algorithm, PolarNet, which greatly increases the mIoU on three drastically different real urban LiDAR single-scan segmentation datasets while retaining ultra-low latency and near real-time throughput.
RangeNet++: Fast and Accurate LiDAR Semantic Segmentation
This paper uses range images as an intermediate representation in combination with a Convolutional Neural Network exploiting the rotating LiDAR sensor model, and proposes a novel post-processing algorithm that deals with problems arising from this intermediate representation.
SalsaNet: Fast Road and Vehicle Segmentation in LiDAR Point Clouds for Autonomous Driving
This paper introduces a deep encoder-decoder network, named SalsaNet, for efficient semantic segmentation of 3D LiDAR point clouds, and introduces an auto-labeling process which transfers automatically generated labels from the camera to LiDAR.
Dual Attention Network for Scene Segmentation
New state-of-the-art segmentation performance is achieved on three challenging scene segmentation datasets (Cityscapes, PASCAL Context, and COCO Stuff) without using coarse data.
Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges
This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving with an overview of on-board sensors on test vehicles, open datasets, and background information.
SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation
This paper proposes a generalization of PointPainting that can apply fusion at different levels, demonstrates its strength in detecting challenging pedestrian cases, and outperforms current state-of-the-art approaches.