DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry

  title={DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry},
  author={Ran Zhu and Mingkun Yang and Wang Liu and Rujun Song and Bo Yan and Zhuoling Xiao},
The technology for Visual Odometry (VO) that estimates the position and orientation of the moving object through analyzing the image sequences captured by on-board cameras, has been well investigated with the rising interest in autonomous driving. This paper studies monocular VO from the perspective of Deep Learning (DL). Unlike most current learning-based methods, our approach, called DeepAVO, is established on the intuition that features contribute discriminately to different motion patterns… 
1 Citations
Simultaneous Localization and Mapping Related Datasets: A Comprehensive Survey
A range of cohesive topics aboutSLAM related datasets are covered, including general collection methodology and fundamental characteristic dimensions, SLAM related tasks taxonomy and datasets categorization, introduction of state-of-the-arts, overview and comparison of existing datasets, review of evaluation criteria, and analyses and discussions about current limitations and future directions.


Guided Feature Selection for Deep Visual Odometry
This work proposes a dual-branch recurrent network to learn the rotation and translation separately by leveraging current Convolutional Neural Network for feature representation and Recurrent Neural Network (RNN) for image sequence reasoning.
Exploring Representation Learning With CNNs for Frame-to-Frame Ego-Motion Estimation
This work explores the use of convolutional neural networks to learn both the best visual features and the best estimator for the task of visual ego-motion estimation and shows that this approach is robust with respect to blur, luminance, and contrast anomalies and outperforms most state-of-the-art approaches even in nominal conditions.
DeepVO: Towards end-to-end visual odometry with deep Recurrent Convolutional Neural Networks
Extensive experiments on the KITTI VO dataset show competitive performance to state-of-the-art methods, verifying that the end-to-end Deep Learning technique can be a viable complement to the traditional VO systems.
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
D3VO tightly incorporates the predicted depth, pose and uncertainty into a direct visual odometry method to boost both the front-end tracking as well as the back-end non-linear optimization.
Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry
A visual odometry system called Flowdometry is proposed based on optical flow and deep learning, which offers a 23.796x speedup over state-of-the-art methods using deep learning.
DeMoN: Depth and Motion Network for Learning Monocular Stereo
This work trains a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs, and in contrast to the popular depth-from-single-image networks, DeMoN learns the concept of matching and better generalizes to structures not seen during training.
Dynamic Attention-based Visual Odometry
This paper proposes a dynamic attention-based visual odometry framework (DAVO), a learning-based VO method, for estimating the ego-motion of a monocular camera, and performs a number of experiments to examine the impacts of the dynamically adjusted weights on the accuracy of the evaluated trajectories.
End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks
Competitive performance of the proposed ESP-VO to the state-of-the-art methods is shown, demonstrating a promising potential of the deep learning technique for VO and verifying that it can be a viable complement to current VO systems.
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
An adaptive geometric consistency loss is proposed to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively and achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.