ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings
@article{Huang2020ClusterVOCM,
  title={ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings},
  author={Jiahui Huang and Sheng Yang and Tai-Jiang Mu and Shimin Hu},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={2165--2174}
}
We present ClusterVO, a stereo Visual Odometry which simultaneously clusters and estimates the motion of both ego and surrounding rigid clusters/objects. Unlike previous solutions relying on batch input or imposing priors on scene structure or dynamic object models, ClusterVO is online, general and thus can be used in various scenarios including indoor scene understanding and autonomous driving. At the core of our system lies a multi-level probabilistic association mechanism and a heterogeneous…
27 Citations
DynaSLAM II: Tightly-Coupled Multi-Object Tracking and SLAM
- Computer Science
- IEEE Robotics and Automation Letters
- 2021
DynaSLAM II is presented, a visual SLAM system for stereo and RGB-D camera configurations that tightly integrates the multi-object tracking capability and makes use of instance semantic segmentation and ORB features to track dynamic objects.
Multimotion Visual Odometry (MVO)
- Computer Science
- ArXiv
- 2021
Multimotion Visual Odometry (MVO), a multimotion estimation pipeline that estimates the full SE(3) trajectory of every motion in the scene, including the sensor egomotion, without relying on appearance-based information, is presented.
S3LAM: Structured Scene SLAM
- Computer Science
- ArXiv
- 2021
This work proposes a new SLAM system based on ORB-SLAM2 that creates a semantic map made of clusters of points corresponding to object instances and structures in the scene, which improves both camera localization and reconstruction and enables a better understanding of the scene.
Incorporating Large Vocabulary Object Detection and Tracking into Visual SLAM
- Computer Science
- 2020
This master’s thesis presents a dynamic visual simultaneous localization and mapping (SLAM) system based on a neural network for semantic tracking of 2D bounding boxes and bundle adjustment (BA) optimization of geometric keypoints, and shows qualitative object-tracking results for object classes other than cars and pedestrians.
DOT: Dynamic Object Tracking for Visual SLAM
- Computer Science
- 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021
DOT (Dynamic Object Tracking) combines instance segmentation and multi-view geometry to generate masks for dynamic objects in order to allow SLAM systems based on rigid scene models to avoid such image areas in their optimizations.
View Birdification in the Crowd: Ground-Plane Localization from Perceived Movements
- Computer Science
- ArXiv
- 2021
The method first estimates the observer’s movement and then localizes surrounding pedestrians for each frame while taking into account the local interactions between them, and derives a cascaded optimization method from a Bayesian perspective.
Semantics Aware Dynamic SLAM Based on 3D MODT
- Computer Science
- Sensors
- 2021
The results suggest that the proposed dynamic SLAM framework can perform in real-time with budgeted computational resources and the fused MODT provides rich semantic information that can be readily integrated into SLAM.
Accurate Dynamic SLAM Using CRF-Based Long-Term Consistency
- Computer Science
- IEEE Transactions on Visualization and Computer Graphics
- 2022
This article presents a novel RGB-D SLAM approach for accurate camera pose tracking in dynamic environments, providing a more accurate dynamic 3D landmark detection method, followed by the use of long-term consistency via conditional random fields, which leverages long-term observations from multiple frames.
A Switching-Coupled Backend for Simultaneous Localization and Dynamic Object Tracking
- Computer Science
- IEEE Robotics and Automation Letters
- 2021
This work proposes a novel switching-coupled back-end solution and theoretically derives its concrete form using a probabilistic representation based on the switching strategy and the proposed object classification criteria, where the object uncertainty, observation quality and prior information are jointly considered.
Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2022
SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps that relate learned functions defined on the point clouds, achieves a state-of-the-art performance in registration accuracy, while being flexible and efficient as it avoids the costly optimization over point-wise permutations by the use of basis function maps.
References
ClusterSLAM: A SLAM Backend for Simultaneous Rigid Body Clustering and Motion Estimation
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
Evaluations on both synthetic scenes and KITTI demonstrate the capability of the approach, and further experiments considering online efficiency also show the effectiveness of the method for simultaneous tracking of ego-motion and multiple objects.
Robust Dense Mapping for Large-Scale Dynamic Environments
- Computer Science
- 2018 IEEE International Conference on Robotics and Automation (ICRA)
- 2018
A stereo-based dense mapping algorithm for large-scale dynamic urban environments that simultaneously reconstructs the static background, the moving objects, and the potentially moving but currently stationary objects separately, which is desirable for high-level mobile robotic tasks such as path planning in crowded environments.
Occlusion-Robust MVO: Multimotion Estimation Through Occlusion Via Motion Closure
- Computer Science
- 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2020
This paper presents a pipeline for estimating multiple motions, including the camera egomotion, in the presence of occlusions, and uses an expressive motion prior to estimate the SE(3) trajectory of every motion in the scene, even during temporary occlusion.
EM-Fusion: Dynamic Object-Level SLAM With Probabilistic Data Association
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This paper proposes a novel approach to dynamic SLAM with dense object-level representations, which represents rigid objects in local volumetric signed distance function (SDF) maps and formulates multi-object tracking as direct alignment of RGB-D images with the SDF representations.
Multimotion Visual Odometry (MVO): Simultaneous Estimation of Camera and Third-Party Motions
- Computer Science
- 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2018
The traditional visual odometry pipeline is extended to estimate the full motion of both a stereo/RGB-D camera and the dynamic scene, and its performance is evaluated on a real-world dynamic dataset with ground truth for all motions from a motion capture system.
MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM
- Computer Science
- 2019 International Conference on Robotics and Automation (ICRA)
- 2019
This system is the first system to generate an object-level dynamic volumetric map from a single RGB-D camera, which can be used directly for robotic tasks and demonstrates its effectiveness by quantitatively and qualitatively testing it on both synthetic and real-world sequences.
Object scene flow for autonomous vehicles
- Computer Science
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
A novel model and dataset for 3D scene flow estimation with an application to autonomous driving; the model represents each element in the scene by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object.
Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving
- Computer Science
- ECCV
- 2018
Based on the object-aware-aided camera pose tracking which is robust in dynamic environments, in combination with the novel dynamic object bundle adjustment (BA) approach to fuse temporal sparse feature correspondences and the semantic 3D measurement model, 3D object pose, velocity and anchored dynamic point cloud estimation are obtained with instance accuracy and temporal consistency.
DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes
- Computer Science
- IEEE Robotics and Automation Letters
- 2018
DynaSLAM is a visual SLAM system that, building on ORB-SLAM2, adds the capabilities of dynamic object detection and background inpainting, and outperforms the accuracy of standard visual SLAM baselines in highly dynamic scenarios.
Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking
- Computer Science
- 2018 IEEE International Conference on Robotics and Automation (ICRA)
- 2018
This paper introduces geometry and object shape and pose costs for multi-object tracking in urban driving scenarios. Using images from a monocular camera alone, we devise pairwise costs for object…