DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person Detection in 2D Range Data
@article{Jia2020DRSPAAMAS,
  title   = {DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person Detection in 2D Range Data},
  author  = {Dan Jia and Alexander Hermans and B. Leibe},
  journal = {2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year    = {2020},
  pages   = {10270-10277}
}
Detecting persons using a 2D LiDAR is a challenging task due to the low information content of 2D range data. To alleviate the problem caused by the sparsity of the LiDAR points, current state-of-the-art methods fuse multiple previous scans and perform detection using the combined scans. The downside of such backward-looking fusion is that all the scans need to be aligned explicitly, and the necessary alignment operation makes the whole pipeline more expensive – often too expensive for real…
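The abstract contrasts backward-looking fusion, which must explicitly align a window of past scans, with the proposed spatial-attention, auto-regressive update that consumes only the newest scan. The following is a minimal, hypothetical PyTorch sketch of that update idea; the per-cutout feature shape, the gating layer, and the neighborhood handling are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only (not the DR-SPAAM code): keep a per-cutout template
# feature and blend it with features of the newest scan, using a
# similarity-based spatial attention over neighboring cutouts, so that no
# explicit scan alignment is required.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AutoRegressiveSpatialAttention(nn.Module):
    def __init__(self, feat_dim: int, num_neighbors: int = 5):
        super().__init__()
        assert num_neighbors % 2 == 1, "use an odd neighborhood for symmetric context"
        self.num_neighbors = num_neighbors
        # Hypothetical gate that decides how much of the old template to keep.
        self.alpha_gate = nn.Linear(2 * feat_dim, 1)
        self.template = None  # (num_cutouts, feat_dim), carried across scans

    @torch.no_grad()  # inference-time sketch; training would handle gradients differently
    def forward(self, cur_feats: torch.Tensor) -> torch.Tensor:
        # cur_feats: (num_cutouts, feat_dim) features computed from the newest scan only.
        if self.template is None:
            self.template = cur_feats.clone()
            return self.template

        pad = self.num_neighbors // 2
        # Zero-pad along the cutout dimension, then gather sliding neighborhoods:
        # neighbors has shape (num_cutouts, feat_dim, num_neighbors).
        padded = F.pad(self.template, (0, 0, pad, pad))
        neighbors = padded.unfold(0, self.num_neighbors, 1)

        # Spatial attention: each cutout attends to nearby template cutouts by
        # feature similarity, which absorbs small motion between scans.
        sim = torch.einsum("nc,nck->nk", cur_feats, neighbors)
        attn = sim.softmax(dim=-1)
        attended = torch.einsum("nk,nck->nc", attn, neighbors)

        # Auto-regressive update: a learned gate blends old and new information.
        alpha = torch.sigmoid(self.alpha_gate(torch.cat([attended, cur_feats], dim=-1)))
        self.template = alpha * attended + (1.0 - alpha) * cur_feats
        return self.template
```

In such a scheme each incoming scan triggers only a constant amount of extra work, whereas backward-looking fusion has to re-align and re-process the whole window of past scans.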
17 Citations
Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera
- Computer Science · 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021
This work proposes a method that uses bounding boxes from an image-based detector on a calibrated camera to automatically generate training labels for 2D LiDAR-based person detectors, and shows that self-supervised detectors, trained or fine-tuned with these pseudo-labels, outperform detectors trained only on a different dataset.
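The self-supervision idea above can be illustrated with a generic projection-based pseudo-labeling routine: LiDAR points that project into a person bounding box of the calibrated camera are marked as positives. The sketch below is a simplified assumption of how such labels could be generated (the function name, calibration matrices, and flat-ground assumption are hypothetical), not the cited paper's pipeline.

```python
# Hypothetical sketch of camera-supervised pseudo-labeling for a 2D LiDAR scan.
import numpy as np


def pseudo_label_scan(scan_xy, boxes, K, T_cam_laser):
    """scan_xy: (N, 2) LiDAR points in the laser frame (meters).
    boxes: list of (x_min, y_min, x_max, y_max) person boxes in pixels.
    K: (3, 3) camera intrinsics.  T_cam_laser: (4, 4) laser-to-camera extrinsics.
    Returns a boolean (N,) array of pseudo-labels."""
    n = len(scan_xy)
    # Lift the 2D scan into 3D homogeneous points (assume z = 0 on the laser plane).
    pts = np.concatenate([scan_xy, np.zeros((n, 1)), np.ones((n, 1))], axis=1)
    cam = (T_cam_laser @ pts.T)[:3]              # points in the camera frame
    in_front = cam[2] > 0.0                      # keep points in front of the camera
    uv = K @ cam                                 # pinhole projection
    uv = uv[:2] / np.clip(uv[2], 1e-6, None)     # normalize to pixel coordinates
    labels = np.zeros(n, dtype=bool)
    for x0, y0, x1, y1 in boxes:
        inside = (uv[0] >= x0) & (uv[0] <= x1) & (uv[1] >= y0) & (uv[1] <= y1)
        labels |= inside & in_front
    return labels
```

In practice such labels are noisy (occlusions, background points inside a box), which is why they are used as pseudo-labels rather than ground truth.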
2D vs. 3D LiDAR-based Person Detection on Mobile Robots
- Environmental Science · 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2022
Person detection is a crucial task for mobile robots navigating in human-populated environments. LiDAR sensors are promising for this task, thanks to their accurate depth measurements and large field…
Domain and Modality Gaps for LiDAR-based Person Detection on Mobile Robots
- Computer Science · ArXiv
- 2021
A series of experiments is conducted, using the recently released JackRabbot dataset and state-of-the-art detectors based on 3D or 2D LiDAR sensors (CenterPoint and DR-SPAAM, respectively), to understand whether detectors pretrained on driving datasets can achieve good performance in mobile robot scenarios, for which there are currently no trained models readily available.
Cross-Modal Analysis of Human Detection for Robotics: An Industrial Case Study
- Computer Science · 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2021
A systematic cross-modal analysis of sensor-algorithm combinations typically used in robotics is conducted, comparing the performance of state-of-the-art person detectors for 2D range data, 3D lidar, and RGB-D data as well as selected combinations thereof in a challenging industrial use-case.
Robotic Vision for Human-Robot Interaction and Collaboration: A Survey and Systematic Review
- Computer Science, Art · ACM Transactions on Human-Robot Interaction
- 2022
It was found that robotic vision was often used in action and gesture recognition, robot movement in human spaces, object handover and collaborative actions, social communication and learning from demonstration.
Sensor fusion for functional safety of autonomous mobile robots in urban and industrial environments
- Computer Science · 2022 IEEE 27th International Conference on Emerging Technologies and Factory Automation (ETFA)
- 2022
This study reviews state-of-the-art sensors and pedestrian detection methods, shows the benefits of AI-based sensor fusion technologies and the limitations of these methods for industrial outdoor and urban AMR safety applications, and proposes methods to overcome these drawbacks.
Human-Centered Navigation and Person-Following with Omnidirectional Robot for Indoor Assistance and Monitoring
- Computer Science · Robotics
- 2022
This paper presents a novel human-centered navigation system that successfully combines a real-time visual perception system with the mobility advantages provided by an omnidirectional robotic platform to precisely adjust the robot orientation and monitor a person while navigating.
Pedestrian-Robot Interactions on Autonomous Crowd Navigation: Reactive Control Methods and Evaluation Metrics
- Computer Science · 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2022
It is concluded that the reactive controller fulfils the necessary task of fast, continuous adaptation in crowd navigation, and that it should be coupled with high-level planners for environmental and situational awareness.
LiDAR-based detection, tracking, and property estimation: A contemporary review
- Art · Neurocomputing
- 2022
Control of adaptive running platform based on machine vision technologies and neural networks
- Computer Science · Neural Computing and Applications
- 2022
The scientific novelty of the study lies in the formalization and comparison of various control methods for adaptive running platforms and of methods for positioning a person on them (using cameras and trackers), which expands the body of knowledge about the optimal control functions of this class of devices.
References
SHOWING 1-10 OF 42 REFERENCES
SECOND: Sparsely Embedded Convolutional Detection
- Computer Science · Sensors
- 2018
An improved sparse convolution method for voxel-based 3D convolutional networks is investigated, which significantly increases the speed of both training and inference; a new form of angle loss regression is also introduced to improve orientation estimation performance.
Multi-view 3D Object Detection Network for Autonomous Driving
- Computer Science · 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes and designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.
Deep Person Detection in 2D Range Data
- Computer Science · ArXiv
- 2018
The deep-learning-based wheelchair and walker detector DROW is shown to generalize to persons, small modifications are introduced that significantly boost DROW's performance, and the DROW dataset is extended with person annotations, making it the largest dataset of person annotations in 2D range data.
Joint 3D Proposal Generation and Object Detection from View Aggregation
- Computer Science · 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2018
This work presents AVOD, an Aggregate View Object Detection network for autonomous driving scenarios that uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
STD: Sparse-to-Dense 3D Object Detector for Point Cloud
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes a two-stage 3D object detection framework, named sparse-to-dense 3D Object Detector (STD), and implements a parallel intersection-over-union (IoU) branch to increase awareness of localization accuracy, resulting in further improved performance.
Frustum PointNets for 3D Object Detection from RGB-D Data
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work directly operates on raw point clouds by popping up RGB-D scans, leveraging both mature 2D object detectors and advanced 3D deep learning for object localization, and achieves efficiency as well as high recall even for small objects.
4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work creates an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks and proposes the hybrid kernel, a special case of the generalized sparse convolution, and trilateral-stationary conditional random fields that enforce spatio-temporal consistency in the 7D space-time-chroma space.
3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work introduces new sparse convolutional operations that are designed to process spatially-sparse data more efficiently, and uses them to develop submanifold sparse convolutional networks (SSCNs), which outperform all prior state-of-the-art models on two tasks involving semantic segmentation of 3D point clouds.
Deep Hough Voting for 3D Object Detection in Point Clouds
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting that achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D with a simple design, compact model size and high efficiency.
PointPillars: Fast Encoders for Object Detection From Point Clouds
- Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
Benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds; the work pairs this encoder with a lean downstream network.