Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video

Oier Mees, Markus Merklinger, Gabriel Kalweit and Wolfram Burgard. "Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video." 2020 IEEE International Conference on Robotics and Automation (ICRA).
Key challenges for the deployment of reinforcement learning (RL) agents in the real world are the discovery, representation, and reuse of skills in the absence of a reward function. To this end, we propose a novel approach to learn a task-agnostic skill embedding space from unlabeled multi-view videos. Our method learns a general skill embedding independently of the task context by using an adversarial loss. We combine a metric learning loss, which utilizes temporal video coherence to learn a…
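The metric-learning component described in the abstract can be illustrated with a margin-based triplet loss over frame embeddings, where temporally close frames serve as positives and temporally distant frames as negatives. A minimal numpy sketch (the function and the toy embeddings are illustrative, not the paper's exact objective):

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Margin-based metric-learning loss: pull the anchor toward the
    positive embedding and push it away from the negative by at least
    `margin` (illustrative sketch, not the paper's exact objective)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy frame embeddings: temporally close frames should embed nearby.
anchor   = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])   # frame adjacent in time
negative = np.array([2.0, 0.0])   # frame far away in time
loss = triplet_margin_loss(anchor, positive, negative)  # 0.0: margin already satisfied
```

Minimizing this loss over many such triplets shapes an embedding in which temporal proximity implies metric proximity, which is the video-coherence signal the paper combines with its adversarial loss.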
Learning Dense Rewards for Contact-Rich Manipulation Tasks
This work extracts dense reward functions algorithmically from a robot's high-dimensional observations, such as images and tactile feedback; because it does not leverage adversarial training, it is less prone to the associated training instabilities.
Self-Supervised Disentangled Representation Learning for Third-Person Imitation Learning
This paper presents a third-person imitation learning (TPIL) approach for robot tasks with egomotion, using a dual auto-encoder structure plus a representation permutation loss and a time-contrastive loss to ensure the state and viewpoint representations are well disentangled, and demonstrates the effectiveness of the approach.
Hindsight for Foresight: Unsupervised Structured Dynamics Models from Physical Interaction
This work proposes a novel approach for modeling the dynamics of a robot’s interactions directly from unlabeled 3D point clouds and images, which leads to effective, interpretable models that can be used for visuomotor control and planning.
Robot Program Parameter Inference via Differentiable Shadow Program Inversion
Shadow Program Inversion (SPI) enables the use of efficient first-order optimizers to infer optimal parameters for originally non-differentiable skills, including many skill variants currently used in production. It generalizes across task objectives, meaning that shadow programs need not be retrained to infer parameters for different task variants.
Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning
This state-of-the-art survey of intelligent robots capable of autonomous decision-making and learning shows that recent advances in deep learning and reinforcement learning have paved the way for robots to perform highly complex tasks.
Few-Shot System Identification for Reinforcement Learning
This work presents a framework for facilitating online system identification across different instances of the same dynamics class: it learns a probability distribution over the dynamics conditioned on observed data via variational inference, and shows that the resulting model-based RL agent robustly solves different instances of control problems with maximum sample efficiency and without extra training.
Composing Pick-and-Place Tasks By Grounding Language
This work presents a robot system that follows unconstrained language instructions to pick and place arbitrary objects, effectively resolving ambiguities through dialogue. It demonstrates the effectiveness of the method in understanding pick-and-place language instructions and in sequentially composing them to solve tabletop manipulation tasks.
Building Intelligent Navigation System for Mobile Robots Based on The SARSA Algorithm
XIRL: Cross-embodiment Inverse Reinforcement Learning
This work presents a self-supervised method for Cross-embodiment Inverse Reinforcement Learning (XIRL) that leverages temporal cycle-consistency constraints to learn deep visual embeddings that capture task progression from offline videos of demonstrations across multiple expert agents, each performing the same task differently due to embodiment differences.
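The temporal cycle-consistency constraint that XIRL leverages can be checked on toy data: a frame of one video, mapped to its nearest neighbour in a second video's embedding and back, should return to itself. A numpy sketch of the check (not XIRL's actual training loss, which differentiably soft-relaxes this idea):

```python
import numpy as np

def cycle_consistent(emb_a, emb_b, i):
    """Check temporal cycle-consistency for frame i of video A: its
    nearest neighbour among B's embeddings should map back to frame i.
    Illustrative check of the constraint, not the training objective."""
    j = int(np.argmin(np.linalg.norm(emb_b - emb_a[i], axis=1)))
    k = int(np.argmin(np.linalg.norm(emb_a - emb_b[j], axis=1)))
    return k == i

# Two well-aligned toy embeddings of the same 5-frame task
emb_a = np.arange(5.0)[:, None]
emb_b = emb_a + 0.1
aligned = all(cycle_consistent(emb_a, emb_b, i) for i in range(5))
```

Embeddings that satisfy this round-trip property for videos of different embodiments encode task progression rather than appearance, which is what makes them usable as a cross-embodiment reward signal.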
Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control
It is shown that training vision-based control policies in simulation while gradually increasing the difficulty of the task via Adaptive Curriculum Generation from Demonstrations (ACGD) improves policy transfer to the real world.


Learning Actionable Representations from Visual Observations
This work shows that representations learned by agents observing themselves taking random actions, or observing other agents performing tasks successfully, enable learning continuous control policies with algorithms such as Proximal Policy Optimization, using only the learned embeddings as input.
Learning an Embedding Space for Transferable Robot Skills
Time-Contrastive Networks: Self-Supervised Learning from Video
A self-supervised approach for learning representations and robotic behaviors entirely from unlabeled videos recorded from multiple viewpoints is proposed, and it is demonstrated that this representation can be used by a robot to directly mimic human poses without an explicit correspondence, and that it can be used as a reward function within a reinforcement learning algorithm.
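The multi-view setup described above supplies triplets for free: two time-synchronized cameras yield anchor/positive pairs, while a temporally distant frame from the anchor's own camera serves as the negative. A sketch of that sampling scheme, assuming the two views are frame-aligned (the function name and toy frames are illustrative):

```python
import random

def sample_tcn_triplet(view1, view2, time_margin=2):
    """Sample a time-contrastive triplet: anchor and positive are
    simultaneous frames from two viewpoints; the negative is a frame
    from the anchor's own view that is temporally distant (more than
    `time_margin` frames away)."""
    T = len(view1)
    t = random.randrange(T)
    far = [i for i in range(T) if abs(i - t) > time_margin]
    t_neg = random.choice(far)
    return view1[t], view2[t], view1[t_neg]

# Toy "frames": integers encode (view, time) so alignment is checkable.
view1 = list(range(10))               # frames 0..9 of camera 1
view2 = [100 + i for i in range(10)]  # frames 0..9 of camera 2
anchor, positive, negative = sample_tcn_triplet(view1, view2)
```

Because the positive shows the same moment from a different viewpoint, the embedding is pushed to be viewpoint-invariant while remaining temporally discriminative.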
Unsupervised Perceptual Rewards for Imitation Learning
This work presents a method that identifies key intermediate steps of a task from only a handful of demonstration sequences and automatically identifies the most discriminative features for recognizing these steps.
Adversarial Discriminative Domain Adaptation
It is shown that ADDA is more effective yet considerably simpler than competing domain-adversarial methods; the promise of the approach is demonstrated by exceeding state-of-the-art unsupervised adaptation results on standard domain adaptation tasks as well as on a difficult cross-modality object classification task.
Unsupervised Visuomotor Control through Distributional Planning Networks
This work learns an unsupervised embedding space under which the robot can measure its own progress toward a goal, enabling effective, control-centric representations that lead to more autonomous reinforcement learning algorithms.
Conditional Adversarial Domain Adaptation
Conditional adversarial domain adaptation is presented: a principled framework that conditions the adversarial adaptation models on discriminative information conveyed in the classifier predictions to guarantee transferability.
Playing hard exploration games by watching YouTube
A two-stage method of one-shot imitation that allows an agent to convincingly exceed human-level performance for the first time on the infamously hard exploration games Montezuma's Revenge, Pitfall!, and Private Eye, even when the agent is not presented with any environment rewards.
Deep visual foresight for planning robot motion
This work develops a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data, enables a real robot to perform nonprehensile manipulation (pushing objects), and can handle novel objects not seen during training.
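The combination of a predictive model with model-predictive control can be sketched with random-shooting planning: sample candidate action sequences, roll each through the model, and keep the best. Below, a trivially linear `predict` stands in for the learned video prediction model, purely for illustration; the planner structure is the point:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(state, action):
    """Stand-in for a learned action-conditioned prediction model;
    here trivially linear dynamics, purely for illustration."""
    return state + action

def plan_random_shooting(state, goal, horizon=5, n_samples=256):
    """Model-predictive planning by random shooting: sample candidate
    action sequences, roll each through the predictive model, and keep
    the sequence whose predicted terminal state is closest to the goal."""
    best_cost, best_seq = np.inf, None
    for _ in range(n_samples):
        seq = rng.uniform(-1.0, 1.0, size=(horizon, state.shape[0]))
        s = state
        for a in seq:
            s = predict(s, a)
        cost = np.linalg.norm(s - goal)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

start, goal = np.zeros(2), np.array([2.0, 0.0])
plan = plan_random_shooting(start, goal)
```

In a receding-horizon loop only the first action of `plan` would be executed before replanning, which is what gives MPC its robustness to model error.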
Composable Deep Reinforcement Learning for Robotic Manipulation
This paper shows that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.
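The composition idea can be shown in a toy discrete-action setting: average the skills' Q-functions and act with the resulting maximum-entropy (soft) policy. A numpy sketch (the Q-values and action space are made up for illustration):

```python
import numpy as np

def soft_policy(q, alpha=1.0):
    """Maximum-entropy (soft) policy: pi(a) proportional to exp(Q(a)/alpha)."""
    z = q / alpha
    z = z - z.max()          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def compose_q(q_list):
    """Compose skills by averaging their soft Q-functions; per the paper,
    the composed policy's suboptimality is bounded by the divergence
    between the constituent policies."""
    return np.mean(np.stack(q_list), axis=0)

# Two toy skills over 3 discrete actions with conflicting preferences
q_reach = np.array([2.0, 0.5, 0.0])   # prefers action 0
q_avoid = np.array([0.0, 0.5, 2.0])   # prefers action 2
pi = soft_policy(compose_q([q_reach, q_avoid]))
```

Because the two skills prefer opposite actions equally strongly, the composed policy spreads its probability symmetrically over both preferred actions rather than committing to either one.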