Corpus ID: 57572955

Unsupervised Learning of Depth and Ego-Motion from Panoramic Video

@article{Sharma2019UnsupervisedLO,
  title={Unsupervised Learning of Depth and Ego-Motion from Panoramic Video},
  author={Alisha Sharma and Jonathan Ventura},
  journal={ArXiv},
  year={2019},
  volume={abs/1901.00979}
}
We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. [...] Key Method: In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical panoramic projection, which allows the use of traditional CNN layers, such as convolutional filters and max pooling, without modification. Our evaluation on synthetic and real data shows that unsupervised learning of depth and ego-motion…
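The reason the cylindrical projection admits unmodified CNN layers is that the panorama is continuous across its left/right seam, so standard convolutions work if the width axis is padded circularly instead of with zeros. A minimal NumPy sketch of that wrap-padding idea (the function name and shapes are illustrative assumptions, not from the paper):

```python
import numpy as np

def cylindrical_pad(feature_map, pad):
    """Wrap-pad the width axis so a standard convolution sees continuous
    content across the 360-degree seam of a cylindrical panorama; the
    height axis is zero-padded as usual."""
    # feature_map has shape (H, W, C)
    wrapped = np.pad(feature_map, ((0, 0), (pad, pad), (0, 0)), mode="wrap")
    return np.pad(wrapped, ((pad, pad), (0, 0), (0, 0)), mode="constant")

# Toy 2x6 single-channel "panorama": rows [0..5] and [6..11].
x = np.arange(12, dtype=float).reshape(2, 6, 1)
y = cylindrical_pad(x, 1)
# y has shape (4, 8, 1); the first padded column repeats the last input
# column (wrap-around), so the horizontal seam stays continuous.
```

After such padding, an ordinary `'valid'` convolution behaves like a seam-aware convolution on the cylinder, which is why no specialized layers are needed.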
Unsupervised learning of depth estimation, camera motion prediction and dynamic object localization from video
TLDR
Experimental results on the KITTI and Cityscapes datasets demonstrate that the proposed unsupervised deep learning framework is more effective in depth estimation, camera motion prediction and dynamic object localization compared to previous models.
PADENet: An Efficient and Robust Panoramic Monocular Depth Estimation Network for Outdoor Scenes
TLDR
A series of experiments show that the proposed PADENet and loss function can effectively improve the accuracy of panoramic depth prediction while maintaining a high level of robustness, reaching the state of the art on the CARLA dataset.
Semantic segmentation of panoramic images using a synthetic dataset
TLDR
Experimental results show that using panoramic images as training data is beneficial to the segmentation result, and that a model trained on panoramic images with a 180-degree FoV achieves better performance.
A Probabilistic Scheme for Representation Learning with Radial Transform Images
  • H. Salehinejad, S. Valaee
  • Computer Science
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
TLDR
The results show that the proposed representation method can achieve higher performance than other similar methods by translating a semantic segmentation problem to a classification problem when limited annotated images are available.

References

SHOWING 1-10 OF 32 REFERENCES
Unsupervised Learning of Depth and Ego-Motion from Video
TLDR
Empirical evaluation demonstrates the effectiveness of the unsupervised learning framework: monocular depth estimation performs comparably with supervised methods that use either ground-truth pose or depth for training, and pose estimation performs favorably compared to established SLAM systems under comparable input settings.
Learning Depth from Monocular Videos Using Direct Methods
TLDR
It is argued that the depth CNN predictor can be learned without a pose CNN predictor, and it is demonstrated empirically that incorporating a differentiable implementation of DVO, along with a novel depth normalization strategy, substantially improves performance over state-of-the-art methods that use monocular videos for training.
Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
TLDR
The main contribution is to explicitly consider the inferred 3D geometry of the whole scene and enforce consistency of the estimated 3D point clouds and ego-motion across consecutive frames, outperforming the state of the art.
Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images
TLDR
This work proposes a learning approach for panoramic depth map estimation from a single image, based on a specifically developed distortion-aware deformable convolution filter, which can be trained on conventional perspective images and then used to regress depth for panoramic images, bypassing the effort needed to create an annotated panoramic training dataset.
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
TLDR
An adaptive geometric consistency loss is proposed to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively and achieves state-of-the-art results in all three tasks, performing better than previous unsupervised methods and comparably with supervised ones.
Unsupervised Monocular Depth Estimation with Left-Right Consistency
TLDR
This paper proposes a novel training objective that enables the convolutional neural network to learn to perform single-image depth estimation, despite the absence of ground-truth depth data, and produces state-of-the-art results for monocular depth estimation on the KITTI driving dataset.
SfM-Net: Learning of Structure and Motion from Video
TLDR
A geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations, and often successfully segments the moving objects in the scene.
Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields
TLDR
A deep convolutional neural field model for estimating depth from single monocular images is presented, aiming to jointly explore the capacity of deep CNNs and continuous CRFs, and a deep structured learning scheme is proposed which learns the unary and pairwise potentials of the continuous CRF in a unified deep CNN framework.
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
  • N. Mayer, Eddy Ilg, +4 authors T. Brox
  • Computer Science, Mathematics
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
TLDR
This paper proposes three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks, and presents a convolutional network for real-time disparity estimation that provides state-of-the-art results.
SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images
TLDR
This work presents SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks and enables the transfer of existing perspective convolutional neural network models to the omnidirectional case.