REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer
@article{Liu2022REvolveRCE,
  title   = {REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer},
  author  = {Xingyu Liu and Deepak Pathak and Kris M. Kitani},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2202.05244}
}
A popular paradigm in robotic learning is to train a policy from scratch for every new robot. This is not only inefficient but also often impractical for complex robots. In this work, we consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology. Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail due to optimal…
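The core idea of transferring a policy through a continuum of intermediate robots can be illustrated with a minimal sketch. All names below are hypothetical, not from the paper's released code: we assume robots are described by kinematic parameter vectors, blend them linearly, and fine-tune the policy while sweeping the interpolation coefficient from the source robot toward the target.

```python
# Hypothetical sketch of continuous robot-to-robot policy transfer.
# Assumptions: a robot is summarized by simple kinematic parameters,
# and `finetune` stands in for a few RL iterations in simulation.
from dataclasses import dataclass


@dataclass
class RobotParams:
    link_lengths: list   # per-link lengths (morphology)
    joint_limits: list   # per-joint range of motion (kinematics)


def interpolate(source: RobotParams, target: RobotParams, alpha: float) -> RobotParams:
    """Linearly blend two robot parameterizations; alpha=0 is the source robot."""
    lerp = lambda a, b: [x + alpha * (y - x) for x, y in zip(a, b)]
    return RobotParams(lerp(source.link_lengths, target.link_lengths),
                       lerp(source.joint_limits, target.joint_limits))


def evolve_policy(policy, source, target, n_steps=10, finetune=lambda p, r: p):
    """Fine-tune the policy on each intermediate robot in turn."""
    for k in range(1, n_steps + 1):
        robot = interpolate(source, target, k / n_steps)
        policy = finetune(policy, robot)  # adapt on this intermediate robot
    return policy
```

This only sketches the interpolation-and-fine-tune loop; the paper's actual method operates on physically simulated evolving robots rather than bare parameter vectors.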
3 Citations
HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration
- Computer Science, ArXiv
- 2022
This work shows that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning, and proposes an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
- Computer Science, ArXiv
- 2022
This work explores a method for learning a single policy that controls agents of various morphologies across various tasks by distilling a large amount of proficient behavioral data; the results show that MTGv2-history can improve performance when the policy faces unseen tasks.
Understanding the Complexity Gains of Single-Task RL with a Curriculum
- Computer Science, ArXiv
- 2022
Under mild regularity conditions on the curriculum, it is shown that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies.
References
Showing 1-10 of 47 references
Hardware Conditioned Policies for Multi-Robot Transfer Learning
- Computer Science, NeurIPS
- 2018
This work uses the kinematic structure directly as the hardware encoding, shows strong zero-shot transfer to completely novel robots not seen during training, and demonstrates that fine-tuning the policy network is significantly more sample-efficient than training a model from scratch.
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
- Computer Science, 2019 International Conference on Robotics and Automation (ICRA)
- 2019
The approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution, and gives the controller access to design parameters to allow it to tailor its policy to each design in the distribution.
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control
- Computer Science, ICML
- 2020
It is shown that a single modular policy can successfully generate locomotion behaviors for several planar agents with different skeletal structures, such as monopod hoppers, quadrupeds, and bipeds, and can generalize to variants not seen during training, a process that would normally require training and manual hyperparameter tuning for each morphology.
Modular Robot Design Synthesis with Deep Reinforcement Learning
- Computer Science, AAAI
- 2020
This work uses deep reinforcement learning to create a search heuristic for efficiently searching the space of modular serial manipulator designs, and shows that the algorithm determines robot designs for given tasks more efficiently than the current state of the art.
Joint Optimization of Robot Design and Motion Parameters using the Implicit Function Theorem
- Computer Science, Robotics: Science and Systems
- 2017
A novel computational approach to optimizing the morphological design of robots, which finds that the complex relationship between design and motion parameters can be established via sensitivity analysis if the robot’s movements are modeled as spatio-temporal solutions to optimal control problems.
State-Only Imitation Learning for Dexterous Manipulation
- Computer Science, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2021
This paper trains an inverse dynamics model and uses it to predict actions for state-only demonstrations; the resulting method considerably outperforms RL alone and can learn from demonstrations with different dynamics, morphologies, and objects.
Hierarchically Decoupled Imitation for Morphological Transfer
- Computer Science, ICML
- 2020
It is shown that incentivizing a complex agent's low-level to imitate a simpler agent's low-level significantly improves zero-shot high-level transfer, and that KL-regularized training of the high-level policy stabilizes learning and prevents mode collapse.
Neural Graph Evolution: Towards Efficient Automatic Robot Design
- Computer Science, ICLR
- 2019
Neural Graph Evolution (NGE) is the first algorithm that can automatically discover kinematically preferred robotic graph structures, such as a fish with two symmetrical flat side-fins and a tail, or a cheetah with athletic front and back legs.
Reinforcement Learning for Improving Agent Design
- Computer Science, Artificial Life
- 2019
It is demonstrated that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning.
Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity
- Computer Science, NeurIPS
- 2019
This paper investigates a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies.