Corpus ID: 1541419

Towards Adapting Deep Visuomotor Representations from Simulated to Real Environments

Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Xingchao Peng, Sergey Levine, Kate Saenko, Trevor Darrell
Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real-world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large, easy-to-obtain source dataset (e.g., synthetic images) to a target real-world domain, without…

Related papers

Sim-to-Real Robot Learning from Pixels with Progressive Nets
This work proposes using progressive networks to transfer learned policies from simulation to the real world, and presents an early demonstration of this approach with a number of experiments in robot manipulation that focus on bridging the reality gap.
The Curious Robot: Learning Visual Representations via Physical Interactions
This work builds one of the first systems on a Baxter platform that pushes, pokes, grasps and observes objects in a tabletop environment, with each datapoint providing supervision to a shared ConvNet architecture allowing us to learn visual representations.
Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation
A new large-scale benchmark called Syn2Real is presented, which consists of a synthetic domain rendered from 3D object models and two real-image domains containing the same object categories, and it is concluded that developing adaptation methods that work well across all three tasks presents a significant future challenge for syn2real domain transfer.
Transferring Visuomotor Learning from Simulation to the Real World for Robotics Manipulation Tasks
This work solves the hand-eye coordination task using a visuomotor deep neural network predictor that estimates the arm's joint configuration given a stereo image pair of the arm and the underlying head configuration and demonstrates that this enables accurate reaching of objects while circumventing manual fine-calibration of the robot.
A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies
This work introduces a robust framework that plans in simulation and transfers well to the real environment, consisting of the encoder and planner modules, and shows successful planning performances in different navigation tasks.
3D Simulation for Robot Arm Control with Deep Q-Learning
This work presents an approach which uses 3D simulations to train a 7-DOF robotic arm in a control task without any prior knowledge, and presents preliminary results in direct transfer of policies over to a real robot, without any further training.
Learning Real-World Robot Policies by Dreaming
The dreaming model can emulate samples equivalent to a sequence of images from the actual environment by learning an action-conditioned future representation/scene regressor, enabling robot learning of policies that transfer to the real world.
Zero-Shot Reinforcement Learning with Deep Attention Convolutional Neural Networks
This work theoretically prove and empirically demonstrate that a deep attention convolutional neural network (DACNN) with specific visual sensor configuration performs as well as training on a dataset with high domain and parameter variation at lower computational complexity.
VisDA: The Visual Domain Adaptation Challenge
The 2017 Visual Domain Adaptation (VisDA) dataset and challenge, a large-scale testbed for unsupervised domain adaptation across visual domains, is presented and a baseline performance analysis using various domain adaptation models that are currently popular in the field is provided.
Deep spatial autoencoders for visuomotor learning
This work presents an approach that automates state-space construction by learning a state representation directly from camera images by using a deep spatial autoencoder to acquire a set of feature points that describe the environment for the current task, such as the positions of objects.
End-to-End Training of Deep Visuomotor Policies
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Exploring Invariances in Deep Convolutional Neural Networks Using Synthetic Images
This work uses synthetic images to probe DCNN invariance to object-class variations caused by 3D shape, pose, and photorealism, and shows that DCNNs used as a fixed representation exhibit a large amount of invariances to these factors, but, if allowed to adapt, can still learn effectively from synthetic data.
Learning Transferable Policies for Monocular Reactive MAV Control
This paper proposes a generic framework for learning transferable motion policies, presented in the context of an autonomous MAV flight using monocular reactive control, and demonstrates the efficacy of the proposed approach through extensive real-world flight experiments in outdoor cluttered environments.
Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control
This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only, demonstrating the capability to autonomously learn robot controllers solely from raw-pixel…
Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours
  • Lerrel Pinto, A. Gupta
  • Computer Science
    2016 IEEE International Conference on Robotics and Automation (ICRA)
  • 2016
This paper takes the leap of increasing the available training data to 40 times more than prior work, leading to a dataset of 50K data points collected over 700 hours of robot grasping attempts, which allows training a Convolutional Neural Network to predict grasp locations without severe overfitting.
Geodesic flow kernel for unsupervised domain adaptation
This paper proposes a new kernel-based method that takes advantage of low-dimensional structures that are intrinsic to many vision datasets, and introduces a metric that reliably measures the adaptability between a pair of source and target domains.
Autonomous reinforcement learning on raw visual input data in a real world application
A learning architecture that performs reinforcement learning directly on raw visual input data is presented; the resulting policy, learned only from success or failure, is hardly beaten by an experienced human player.
From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains
This work investigates the use of such freely available 3D models for multicategory 2D object detection and proposes a simple and fast adaptation approach based on decorrelated features, which performs comparably to existing methods trained on large-scale real image domains.
Learning task error models for manipulation
It is argued that in the context of grasping and manipulation, it is sufficient to achieve high accuracy in the task relevant state space and a data-driven approach that learns task error models that account for such unmodeled non-linearities is proposed.