SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection
@article{Bhattacharyya2021SADet3DSB,
  title={SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection},
  author={Prarthana Bhattacharyya and Chengjie Huang and K. Czarnecki},
  journal={2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pages={3022--3031}
}
Existing point-cloud based 3D object detectors use convolution-like operators to process information in a local neighbourhood with fixed-weight kernels and aggregate global context hierarchically. However, non-local neural networks and self-attention for 2D vision have shown that explicitly modeling long-range interactions can lead to more robust and competitive models. In this paper, we propose two variants of self-attention for contextual modeling in 3D object detection by augmenting…
14 Citations
RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation
- Computer Science
- ArXiv
- 2021
RAANet (Range-Aware Attention Network) extracts more powerful BEV features and generates superior 3D object detections; the work also proposes a novel auxiliary loss for density estimation to further enhance RAANet's detection accuracy for occluded objects.
Real-time Hierarchical Soft Attention-based 3D Object Detection in Point Clouds
- Computer Science
- 2022 26th International Conference on Pattern Recognition (ICPR)
- 2022
A real-time Hierarchical Soft Attention Network (HSAN) is proposed to employ soft attention in the backbone of the original network to increase the detection accuracy without slowing down its inference speed.
Accurate and Real-Time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network
- Computer Science
- IEEE Robotics and Automation Letters
- 2023
This work introduces a stackable Pillar Aware Attention (PAA) module to enhance pillar feature extraction while suppressing noise in point clouds, and presents Mini-BiFPN, a small yet effective feature network that creates bidirectional information flow and multi-level cross-scale feature fusion to better integrate multi-resolution features.
AGS-SSD: Attention-Guided Sampling for 3D Single-Stage Detector
- Computer Science
- Electronics
- 2022
AGS-SSD is an attention-guided downsampling method for point-cloud-based 3D object detection; with novel architectures, it achieves significant improvements over the baseline and runs at 24 frames per second at inference.
PiFeNet: Pillar-Feature Network for Real-Time 3D Pedestrian Detection from Point Cloud
- Computer Science
- ArXiv
- 2021
This work introduces a stackable Pillar Aware Attention (PAA) module for enhanced pillar feature extraction while suppressing noise in point clouds, and presents Mini-BiFPN, a small yet effective feature network that creates bidirectional information flow and multi-level cross-scale feature fusion to better integrate multi-resolution features.
3D Object Detection Combining Semantic and Geometric Features from Point Clouds
- Computer Science
- ArXiv
- 2021
The Voxel-Point-Based Module (VTPM) performs the final 3D object detection in point space, which is more conducive to detecting small objects and avoids preset anchors at the inference stage.
D-Align: Dual Query Co-attention Network for 3D Object Detection Based on Multi-frame Point Cloud Sequence
- Computer Science
- ArXiv
- 2022
A new 3D object detector, named D-Align, is proposed, which can effectively produce strong bird’s-eye-view (BEV) features by aligning and aggregating the features obtained from a sequence of point sets.
DANC-Net: Dual-Attention and Negative Constraint Network for Point Cloud Classification
- Computer Science, Environmental Science
- International Journal of Antennas and Propagation
- 2022
DANC-Net uses a dual-attention mechanism to strengthen the interaction between local features of the point cloud signal along both the channel and spatial dimensions, thereby improving the expressive power of the extracted features.
Point Density-Aware Voxels for LiDAR 3D Object Detection
- Environmental Science, Computer Science
- 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2022
Point Density-Aware Voxel network (PDV) is an end-to-end two stage LiDAR 3D object detection architecture that outperforms all state-of-the-art methods on the Waymo Open Dataset and achieves competitive results on the KITTI dataset.
3D Vision with Transformers: A Survey
- Computer Science
- ArXiv
- 2022
A systematic and thorough review of more than 100 transformer methods for different 3D vision tasks, including classification, segmentation, detection, completion, pose estimation, and others, which compares their performance to common non-transformer methods on 12 3D benchmarks.
References
SCANet: Spatial-channel Attention Network for 3D Object Detection
- Computer Science
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
This work proposes the Spatial-Channel Attention Network (SCANet), a two-stage detector that takes both LiDAR point clouds and RGB images as input to generate 3D object estimates, and designs a new multi-level fusion scheme for accurate classification and 3D bounding box regression.
MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This paper introduces three context modules into the voting and classifying stages of VoteNet to encode contextual information at different levels, and proposes Multi-Level Context VoteNet (MLCVNet) to recognize 3D objects correlatively, building on the state-of-the-art VoteNet.
TANet: Robust 3D Object Detection from Point Clouds with Triple Attention
- Computer Science
- AAAI
- 2020
A novel TANet is introduced in this paper, which mainly contains a Triple Attention (TA) module, and a Coarse-to-Fine Regression (CFR) module that boosts the accuracy of localization without excessive computation cost.
PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation
- Computer Science
- NeurIPS
- 2019
A novel 3D Domain Adaptation Network for point cloud data (PointDAN) is proposed, which jointly aligns global and local features at multiple levels and demonstrates superiority over state-of-the-art general-purpose DA methods.
3D Object Detection with Pointformer
- Computer Science, Environmental Science
- 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2021
This paper proposes Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively, and introduces an efficient coordinate refinement module to shift down-sampled points closer to object centroids, which improves object proposal generation.
Attentional ShapeContextNet for Point Cloud Recognition
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
The resulting model, called ShapeContextNet, consists of a hierarchy with modules not relying on a fixed grid while still enjoying properties similar to those in convolutional neural networks - being able to capture and propagate the object part information.
Deep Hough Voting for 3D Object Detection in Point Clouds
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting that achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D with a simple design, compact model size and high efficiency.
Frustum PointNets for 3D Object Detection from RGB-D Data
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work directly operates on raw point clouds by popping up RGBD scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
Extensive experiments on the 3D detection benchmark of KITTI dataset show that the proposed architecture outperforms state-of-the-art methods with remarkable margins by using only point cloud as input.
Attentional PointNet for 3D-Object Detection in Point Clouds
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2019
This study proposes Attentional PointNet, which is a novel end-to-end trainable deep architecture for object detection in point clouds that extends the theory of visual attention mechanisms to 3D point clouds and introduces a new recurrent 3D Localization Network module.