Depth From Videos in the Wild: Unsupervised Monocular Depth Learning From Unknown Cameras
- A. Gordon, Hanhan Li, Rico Jonschkowski, A. Angelova
- Computer Science · IEEE International Conference on Computer Vision
- 10 April 2019
This work is the first to learn the camera intrinsic parameters, including lens distortion, from video in an unsupervised manner, thereby allowing us to extract accurate depth and motion from arbitrary videos of unknown origin at scale.
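The forward model behind this setup can be sketched as a pinhole projection whose intrinsics are free parameters rather than fixed constants. The sketch below is illustrative only (the paper also learns lens distortion and optimizes everything jointly with depth and egomotion, which is omitted here); all names and values are hypothetical.

```python
import numpy as np

def project(points_3d, fx, fy, cx, cy):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates.

    In the unsupervised setting, fx/fy/cx/cy would be learnable
    parameters updated by gradient descent alongside depth and motion.
    """
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    u = fx * x / z + cx  # horizontal pixel coordinate
    v = fy * y / z + cy  # vertical pixel coordinate
    return np.stack([u, v], axis=1)

points = np.array([[0.5, -0.25, 2.0]])  # one 3D point, 2 m in front of the camera
pixels = project(points, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixels)  # [[445.  177.5]]
```

Because the projection is differentiable in the intrinsics, a photometric reconstruction loss can supervise them without any calibration data.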
What Matters in Unsupervised Optical Flow
- Rico Jonschkowski, Austin Stone, J. Barron, A. Gordon, K. Konolige, A. Angelova
- Computer Science · European Conference on Computer Vision
- 8 June 2020
A new unsupervised flow technique is presented that significantly outperforms the previous unsupervised state-of-the-art and performs on par with the supervised FlowNet2 on the KITTI 2015 dataset, while also being significantly simpler than related approaches.
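The core training signal in unsupervised optical flow is a photometric loss: warp the second frame back by the predicted flow and penalize the difference from the first frame. A minimal sketch (not the paper's implementation, which uses bilinear sampling, occlusion handling, and further refinements):

```python
import numpy as np

def warp_nearest(img, flow):
    """Backward-warp img (H,W) by flow (H,W,2) with nearest-neighbor sampling."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs2 = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    ys2 = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[ys2, xs2]

def photometric_loss(frame1, frame2, flow):
    # Mean absolute difference between frame1 and frame2 warped back by flow.
    return np.abs(frame1 - warp_nearest(frame2, flow)).mean()

# A vertical bar shifted one pixel right; the true flow (dx=1) gives zero loss.
f1 = np.zeros((4, 4)); f1[:, 1] = 1.0
f2 = np.zeros((4, 4)); f2[:, 2] = 1.0
flow = np.zeros((4, 4, 2)); flow[..., 0] = 1.0
print(photometric_loss(f1, f2, flow))  # 0.0
```

Minimizing this loss over predicted flow fields requires no ground-truth flow labels, which is what makes the approach unsupervised.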
Differentiable Particle Filters: End-to-End Learning with Algorithmic Priors
This work presents differentiable particle filters (DPFs), a differentiable implementation of the particle filter algorithm with learnable motion and measurement models, encoding the structure of recursive state estimation: prediction and measurement updates operating on a probability distribution over states.
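The recursive structure that DPFs make differentiable can be sketched as a plain particle filter step: a prediction step in which a motion model moves the particles, and a measurement update that reweights them by observation likelihood. In a DPF both models are learned networks; the toy 1-D version below uses fixed hypothetical functions instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(particles, action, noise=0.1):
    # Motion model: shift each particle by the action, plus process noise.
    return particles + action + rng.normal(0.0, noise, particles.shape)

def update(particles, weights, observation, sigma=0.5):
    # Measurement model: Gaussian likelihood of the observation per particle.
    likelihood = np.exp(-0.5 * ((particles - observation) / sigma) ** 2)
    w = weights * likelihood
    return w / w.sum()  # renormalize to a probability distribution

particles = rng.normal(0.0, 1.0, 100)   # 100 hypotheses about a 1-D state
weights = np.full(100, 1.0 / 100)
particles = predict(particles, action=1.0)
weights = update(particles, weights, observation=1.0)
estimate = np.sum(weights * particles)  # posterior mean state estimate
```

Because every step here is smooth in the model parameters (resampling, which is not, is the part DPFs handle specially), gradients of a state-estimation loss can flow end-to-end into the motion and measurement models.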
Learning state representations with robotic priors
This work identifies five robotic priors and explains how they can be used to learn pertinent state representations, and shows that the state representations learned by the method greatly improve generalization in reinforcement learning.
State Representation Learning in Robotics: Using Prior Knowledge about Physical Interaction
It is shown that the method extracts task-relevant state representations from high-dimensional observations, even in the presence of task-irrelevant distractions, and that the state representations learned by the method greatly improve generalization in reinforcement learning.
The Distracting Control Suite - A Challenging Benchmark for Reinforcement Learning from Pixels
The experiments show that current RL methods for vision-based control perform poorly under distractions, and that their performance decreases with increasing distraction complexity, showing that new methods are needed to cope with the visual complexities of the real world.
KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects
- Xingyu Liu, Rico Jonschkowski, A. Angelova, K. Konolige
- Computer Science · Computer Vision and Pattern Recognition
- 5 December 2019
This paper establishes an easy method for capturing and labeling 3D keypoints on desktop objects with an RGB camera, and develops a deep neural network, called KeyPose, that learns to accurately predict object poses from stereo input using 3D keypoints, and works even for transparent objects.
Lessons from the Amazon Picking Challenge: Four Aspects of Building Robotic Systems
- Clemens Eppner, S. Höfer, O. Brock
- Computer Science · International Joint Conference on Artificial…
- 18 June 2016
This work describes the winning entry to the Amazon Picking Challenge, and suggests to characterize robotic system building along four key aspects, each of them spanning a spectrum of solutions—modularity vs. integration, generality vs. assumptions, computation vs. embodiment, and planning vs. feedback.
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale
A scalable and intuitive framework for specifying new tasks through user-provided examples of desired outcomes, a multi-robot collective learning system that simultaneously collects experience for multiple tasks, and a scalable and generalizable multi-task deep reinforcement learning method, called MT-Opt, are developed.
Conditional Object-Centric Learning from Video
- Thomas Kipf, Gamaleldin F. Elsayed, Klaus Greff
- Computer Science · International Conference on Learning…
- 24 November 2021
Using the temporal dynamics of video data in the form of optical flow, and conditioning the model on simple object location cues, enables segmenting and tracking objects in significantly more realistic synthetic data; this could pave the way for a range of weakly supervised approaches and allow more effective interaction with trained models.