Corpus ID: 246706149

REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer

Xingyu Liu, Deepak Pathak, Kris M. Kitani
A popular paradigm in robotic learning is to train a policy from scratch for every new robot. This is not only inefficient but also often impractical for complex robots. In this work, we consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology. Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail due to optimal… 
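The core idea, fine-tuning a policy while the robot is continuously interpolated from the source to the target, can be sketched as follows. The parameter names (`leg_length`, `torso_mass`) and the simple linear blend are illustrative assumptions, not the paper's actual parameterization, which operates on full kinematics and morphology:

```python
def interpolate_robot(source_params, target_params, alpha):
    """Linearly blend two robots' physical parameters.

    alpha = 0 gives the source robot, alpha = 1 the target.
    Parameters are a flat dict of scalars here for illustration only.
    """
    return {k: (1 - alpha) * source_params[k] + alpha * target_params[k]
            for k in source_params}

def evolve_policy(policy, source, target, steps=10, finetune=None):
    """Walk the evolution path, fine-tuning the policy on each
    intermediate robot (finetune stands in for a few RL updates)."""
    for i in range(1, steps + 1):
        robot = interpolate_robot(source, target, i / steps)
        if finetune is not None:
            policy = finetune(policy, robot)
    return policy

# Hypothetical example: morph a short-legged robot into a longer-legged one.
source = {"leg_length": 0.3, "torso_mass": 5.0}
target = {"leg_length": 0.6, "torso_mass": 8.0}
halfway = interpolate_robot(source, target, 0.5)
```

Each intermediate robot differs only slightly from the last, so the policy never has to bridge the full source-to-target gap in one jump.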


HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration

This work shows that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning, and proposes an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.

A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation

A method is explored for learning a single policy that controls agents of various morphologies to solve various tasks by distilling a large amount of proficient behavioral data; the results show that MTGv2-history can help improve performance when the policy faces unseen tasks.

Understanding the Complexity Gains of Single-Task RL with a Curriculum

Under mild regularity conditions on the curriculum, it is shown that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies.

Hardware Conditioned Policies for Multi-Robot Transfer Learning

This work uses the kinematic structure directly as the hardware encoding, shows strong zero-shot transfer to completely novel robots not seen during training, and demonstrates that fine-tuning the policy network is significantly more sample-efficient than training a model from scratch.
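A minimal sketch of the conditioning step, assuming the kinematic structure has already been flattened into a vector (the encoding contents below are made up for illustration):

```python
def hardware_conditioned_input(state, hardware_encoding):
    """Concatenate a robot-specific kinematic encoding onto the
    observation so one policy network can serve many robots."""
    return list(state) + list(hardware_encoding)

# Hypothetical encoding: alternating (link length, joint-axis flag) values.
state = [0.1, -0.2, 0.05]
encoding = [0.4, 1.0, 0.3, 0.0]
policy_input = hardware_conditioned_input(state, encoding)
```

Because the hardware vector is just extra input dimensions, a robot unseen at training time only needs its encoding computed, not a new network.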

Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning

The approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution, and gives the controller access to design parameters to allow it to tailor its policy to each design in the distribution.
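The outer design loop can be illustrated with a cross-entropy-style stand-in for the paper's reinforcement-learning update; the one-dimensional design space and the toy reward are assumptions made for this sketch:

```python
import random

def cem_design_search(reward_fn, mean=0.0, std=2.0, pop=50, elite=10, iters=30):
    """Maintain a Gaussian distribution over designs: sample a
    population, score each design with the controller's reward,
    and refit the distribution to the top performers."""
    for _ in range(iters):
        designs = [random.gauss(mean, std) for _ in range(pop)]
        designs.sort(key=reward_fn, reverse=True)
        elites = designs[:elite]
        mean = sum(elites) / elite
        std = max(1e-3, (sum((d - mean) ** 2 for d in elites) / elite) ** 0.5)
    return mean

# Toy reward peaking at design parameter 2.0 (stands in for an RL rollout).
random.seed(0)
best_design = cem_design_search(lambda d: -(d - 2.0) ** 2)
```

In the paper's setting, each design's score would come from rolling out the design-conditioned controller, so design and control improve together rather than in separate stages.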

One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

It is shown that a single modular policy can successfully generate locomotion behaviors for several planar agents with different skeletal structures such as monopod hoppers, quadrupeds, bipeds, and generalize to variants not seen during training -- a process that would normally require training and manual hyperparameter tuning for each morphology.
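The weight-sharing idea can be sketched with a trivial linear "module" reused across joints; the module and its weights are placeholders for the paper's learned networks and message passing:

```python
def shared_module(joint_obs, weight=0.5, bias=0.1):
    """One tiny module (a linear unit here) reused for every joint,
    regardless of how many joints the morphology has."""
    return weight * sum(joint_obs) + bias

def modular_policy(per_joint_obs):
    """Apply the same shared module to each joint's local observation,
    yielding one action per joint, so the architecture needs no change
    when the number of limbs changes."""
    return [shared_module(obs) for obs in per_joint_obs]

# The same weights control a 2-joint and a 4-joint morphology.
biped_actions = modular_policy([[0.1, 0.2], [0.0, -0.1]])
quadruped_actions = modular_policy([[0.1], [0.2], [0.3], [0.4]])
```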

Modular Robot Design Synthesis with Deep Reinforcement Learning

This work uses deep reinforcement learning to create a search heuristic that allows us to efficiently search the space of modular serial manipulator designs and shows that the algorithm is more computationally efficient in determining robot designs for given tasks in comparison to the current state-of-the-art.

Joint Optimization of Robot Design and Motion Parameters using the Implicit Function Theorem

A novel computational approach to optimizing the morphological design of robots, which finds that the complex relationship between design and motion parameters can be established via sensitivity analysis if the robot’s movements are modeled as spatio-temporal solutions to optimal control problems.

State-Only Imitation Learning for Dexterous Manipulation

This paper trains an inverse dynamics model and uses it to predict actions for state-only demonstrations; the approach considerably outperforms RL alone and is able to learn from demonstrations with different dynamics, morphologies, and objects.

Hierarchically Decoupled Imitation for Morphological Transfer

It is shown that incentivizing a complex agent's low-level policy to imitate a simpler agent's low-level policy significantly improves zero-shot high-level transfer, and that KL-regularized training of the high level stabilizes learning and prevents mode collapse.

Neural Graph Evolution: Towards Efficient Automatic Robot Design

Neural Graph Evolution (NGE) is the first algorithm that can automatically discover kinematically preferred robotic graph structures, such as a fish with two symmetrical flat side-fins and a tail, or a cheetah with athletic front and back legs.

Reinforcement Learning for Improving Agent Design

It is demonstrated that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning.

Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

This paper investigates a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies.