Scale Drift-Aware Large Scale Monocular SLAM

@inproceedings{strasdat2010scale,
  title={Scale Drift-Aware Large Scale Monocular SLAM},
  author={Hauke Malte Strasdat and J. M. M. Montiel and Andrew J. Davison},
  booktitle={Robotics: Science and Systems},
  year={2010}
}
State-of-the-art visual SLAM systems have recently been presented which are capable of accurate, large-scale and real-time performance, but most of these require stereo vision. In particular, we present a new pose-graph optimisation technique which allows for the efficient correction of rotation, translation and scale drift at loop closures. To this end, we describe the Lie group of similarity transformations and its relation to the corresponding Lie algebra.
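The scale component is what distinguishes the similarity group Sim(3) from the rigid-body group SE(3): a monocular loop closure can disagree in rotation, translation, and overall scale. As a rough illustration (not the paper's implementation; the `sim3` and `rot_z` helpers are hypothetical), a 4x4 similarity transform carries scale in its upper-left block, and the relative transform between two drifted pose estimates exposes the residual scale factor that a Sim(3) pose graph can then distribute along the loop:

```python
import numpy as np

def sim3(s, R, t):
    """Build a 4x4 similarity transform acting as x -> s * R @ x + t."""
    T = np.eye(4)
    T[:3, :3] = s * R
    T[:3, 3] = t
    return T

def rot_z(theta):
    """Rotation about the z-axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Two estimates of the same keyframe pose, differing by rotation,
# translation, AND scale drift (values are illustrative only):
A = sim3(1.0, rot_z(0.10), np.array([1.00, 0.00, 0.0]))
B = sim3(0.9, rot_z(0.12), np.array([1.05, 0.02, 0.0]))

# The relative similarity B^-1 @ A isolates the discrepancy; since
# det(s * R) = s^3 for a rotation R, the cube root of the determinant
# of the upper-left 3x3 block recovers the residual scale factor.
rel = np.linalg.inv(B) @ A
scale_residual = np.cbrt(np.linalg.det(rel[:3, :3]))
print(round(scale_residual, 4))  # → 1.1111  (i.e. 1 / 0.9)
```

A pure SE(3) pose graph has no degree of freedom to absorb that 1.1111 factor, which is why the paper moves loop-closure optimisation onto Sim(3).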


Large-scale direct SLAM with stereo cameras
A novel Large-Scale Direct SLAM algorithm for stereo cameras (Stereo LSD-SLAM) that runs in real-time at high frame rate on standard CPUs, capable of handling aggressive brightness changes between frames, greatly improving performance in realistic settings.
Full scaled 3D visual odometry from a single wearable omnidirectional camera
A method to recover the scale of the scene using an omnidirectional camera mounted on a helmet and an empirical formula based on biomedical experiments on human walking to cope with scale drift is presented.
LSD-SLAM: Large-Scale Direct Monocular SLAM
A novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale-drift, and an elegant probabilistic solution to include the effect of noisy depth values into tracking are introduced.
Recovering Stable Scale in Monocular SLAM Using Object-Supplemented Bundle Adjustment
This work describes a monocular approach that in addition to point measurements also considers object detections to resolve scale ambiguity and drift in a single-camera simultaneous localization and mapping system.
Long range monocular SLAM
This thesis explores approaches to two problems in the frame-rate computation of a priori unknown 3D scene structure and camera pose using a single camera, or monocular simultaneous localisation and mapping, and develops sparsified direct methods for monocular SLAM.
Real-time scalable structure from motion: from fundamental geometric vision to collaborative mapping
The main cornerstones of this dissertation are given by a number of novel geometrical solutions for absolute and relative camera-pose computation in the calibrated case, which sets a new standard in terms of efficiency and numerical robustness.
Object-aware bundle adjustment for correcting monocular scale drift
Unlike many previous visual odometry methods, this approach does not impose restrictions such as approximately constant camera height or planar roadways, and is therefore applicable to a much wider range of applications.
Robust large scale monocular visual SLAM
A new formalism is developed that builds upon the so called Known Rotation Problem to robustly estimate submaps (parts of the camera trajectory and the unknown environment) and a novel loopy belief propagation algorithm that is able to efficiently align a large number of submaps is proposed.
DT-SLAM: Deferred Triangulation for Robust SLAM
This work introduces a real-time visual SLAM system that incrementally tracks individual 2D features and estimates camera pose by using matched 2D features, regardless of the length of the baseline, and demonstrates that this system improves camera pose estimates and robustness, even under purely rotational motion.
Monocular Visual Odometry using a Planar Road Model to Solve Scale Ambiguity
This paper presents an approach to monocular visual odometry that compensates for drift in scale by applying constraints imposed by the known camera mounting and assumptions about the environment.


Mapping Large Loops with a Single Hand-Held Camera
This paper presents a method for Simultaneous Localization and Mapping (SLAM) relying on a monocular camera as the only sensor which is able to build outdoor, closed-loop maps much larger than previously achieved with visual input alone.
Real-time simultaneous localisation and mapping with a single camera
  A. Davison, Proceedings Ninth IEEE International Conference on Computer Vision, 2003
This work presents a top-down Bayesian framework for single-camera localisation via mapping of a sparse set of natural features using motion modelling and an information-guided active measurement strategy, in particular addressing the difficult issue of real-time feature initialisation via a factored sampling approach.
Monocular Simultaneous Localisation and Mapping
This thesis advances the state of the art in monocular SLAM in terms of efficiency, richness of scene description, statistical correctness, and robustness, and mitigates the problems of tracking failure and large-scale localisation with a unified framework for loop closing and recovery.
Real-time monocular SLAM: Why filter?
This paper performs the first rigorous analysis of the relative advantages of filtering and sparse optimisation for sequential monocular SLAM and concludes that while filtering may have a niche in systems with low processing resources, in most modern applications keyframe optimisation gives the most accuracy per unit of computing time.
Unified Inverse Depth Parametrization for Monocular SLAM
This paper presents a new unified parametrization for point features within monocular SLAM which permits efficient and accurate representation of uncertainty during undelayed initialisation and beyond, all within the standard EKF (Extended Kalman Filter).
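The unified inverse-depth parametrization encodes a feature as the camera centre of first observation, a viewing direction (azimuth and elevation), and an inverse depth, so that low-parallax and distant points stay well-behaved in the EKF. As a minimal sketch of the geometry only (the `inverse_depth_point` helper is hypothetical, not the paper's code), the Euclidean point is recovered as c + (1/ρ)·m(θ, φ):

```python
import numpy as np

def inverse_depth_point(x0, y0, z0, theta, phi, rho):
    """Recover the 3D point from an inverse-depth feature:
    (x0, y0, z0) is the camera centre at first observation,
    (theta, phi) are azimuth/elevation of the viewing ray,
    rho is the inverse depth along that ray."""
    # Unit direction vector m(theta, phi).
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return np.array([x0, y0, z0], dtype=float) + m / rho

# A feature seen straight ahead (theta = phi = 0) with inverse
# depth 0.5 lies at depth 2 along the camera's z-axis:
p = inverse_depth_point(0, 0, 0, 0.0, 0.0, 0.5)
print(p)  # → [0. 0. 2.]
```

The benefit is that ρ → 0 smoothly represents points at infinity, whereas a plain XYZ parametrization would need unbounded coordinates.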
Highly scalable appearance-only SLAM - FAB-MAP 2.0
A new formulation of appearance-only SLAM suitable for very large scale navigation that naturally incorporates robustness against perceptual aliasing is described and demonstrated performing reliable online appearance mapping and loop closure detection over a 1,000 km trajectory.
Real-Time and Robust Monocular SLAM Using Predictive Multi-resolution Descriptors
A robust system for vision-based SLAM using a single camera which runs in real time (typically around 30 fps) and provides superior performance over previous methods in terms of robustness to erratic motion and camera shake, and the ability to recover from measurement loss.
Large-Scale SLAM Building Conditionally Independent Local Maps: Application to Monocular Vision
The results show that the combination of conditional independence, which enables the system to share the camera and feature states between submaps, and local coordinates, which reduce the effects of linearization errors, allow us to obtain precise maps of large areas with pure monocular SLAM in real time.
Parallel Tracking and Mapping for Small AR Workspaces
A system specifically designed to track a hand-held camera in a small AR workspace, processed in parallel threads on a dual-core computer, that produces detailed maps with thousands of landmarks which can be tracked at frame-rate with accuracy and robustness rivalling that of state-of-the-art model-based systems.
A Constant-Time Efficient Stereo SLAM System
This article introduces a simple but novel representation of the environment in terms of a sequence of relative locations, and demonstrates precise local mapping and easy navigation using the relative map, and importantly shows that this can be done without requiring a global minimisation after loop closure.