Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control
@article{Berseth2018ProgressiveRL, title={Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control}, author={Glen Berseth and Kevin Xie and Paul Cernek and Michiel van de Panne}, journal={ArXiv}, year={2018}, volume={abs/1802.04765} }
Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. […] Key method: we extend policy distillation methods to the continuous action setting and leverage this technique to combine expert policies, as evaluated in the domain of simulated bipedal locomotion across different classes of terrain.
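As a rough illustration of the idea in the summary above, the following Python sketch distills two expert policies with continuous actions into a single student by regressing the student's action onto the expert's action at sampled states. Everything in it is an assumption made for illustration: the toy experts `expert_flat` and `expert_slope`, the scalar state, the terrain-gated linear student, and the squared-error loss are hypothetical stand-ins, not the paper's networks, terrains, or training procedure.

```python
import random

# Hypothetical stand-ins for trained expert policies: each maps a scalar
# state to a scalar continuous action for one terrain class.
def expert_flat(state):
    return 0.5 * state

def expert_slope(state):
    return -0.3 * state + 0.2

EXPERTS = [expert_flat, expert_slope]

def features(state, terrain):
    # One-hot terrain gating of [state, 1], so one linear student can
    # represent a different affine map per terrain (an assumed design).
    feats = [0.0] * (2 * len(EXPERTS))
    feats[2 * terrain] = state
    feats[2 * terrain + 1] = 1.0
    return feats

def student_action(weights, state, terrain):
    return sum(w * f for w, f in zip(weights, features(state, terrain)))

def distill(num_steps=5000, lr=0.1, seed=0):
    """Fit the student to the experts' actions with SGD on squared error."""
    rng = random.Random(seed)
    weights = [0.0] * (2 * len(EXPERTS))
    for _ in range(num_steps):
        terrain = rng.randrange(len(EXPERTS))
        state = rng.uniform(-1.0, 1.0)
        target = EXPERTS[terrain](state)  # expert's continuous action
        feats = features(state, terrain)
        error = sum(w * f for w, f in zip(weights, feats)) - target
        # Gradient step on 0.5 * error**2 with respect to the weights.
        weights = [w - lr * error * f for w, f in zip(weights, feats)]
    return weights

def max_error(weights, grid=21):
    """Largest |student - expert| gap over a state grid on every terrain."""
    worst = 0.0
    for terrain, expert in enumerate(EXPERTS):
        for i in range(grid):
            state = -1.0 + 2.0 * i / (grid - 1)
            gap = abs(student_action(weights, state, terrain) - expert(state))
            worst = max(worst, gap)
    return worst

if __name__ == "__main__":
    w = distill()
    print("max |student - expert| on grid:", max_error(w))
```

In this toy setting the student can represent both experts exactly, so the regression gap shrinks toward zero; with neural network policies and real terrains the student would only approximate the experts.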
48 Citations
Learning Locomotion Skills for Cassie: Iterative Design and Sim-to-Real
- Computer Science, CoRL
- 2019
An iterative design approach is described and documented, reflecting the multiple design iterations of the reward that are often (if not always) needed in practice, and the transfer of policies learned in simulation to the physical robot is demonstrated without dynamics randomization.
Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie
- Computer Science, ArXiv
- 2019
This paper proposes a practical method that allows the reward function to be fully redefined on each successive design iteration while limiting the deviation from the previous iteration, and demonstrates the effectiveness of this iterative-design approach on the bipedal robot Cassie.
Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning
- Computer Science, ArXiv
- 2019
This paper presents an approach based on model-free Deep Reinforcement Learning to control recovery maneuvers of quadrupedal robots using a hierarchical behavior-based controller that manifests dynamic and reactive recovery behaviors to recover from an arbitrary fall configuration within less than 5 seconds.
Self-Imitation Learning of Locomotion Movements through Termination Curriculum
- Computer Science, MIG
- 2019
A novel combination of techniques for accelerating the learning of stable locomotion movements through self-imitation learning of synthetic animations using a novel curriculum learning approach called Termination Curriculum (TC), that adapts the episode termination threshold over time.
A New Framework for Multi-Agent Reinforcement Learning - Centralized Training and Exploration with Decentralized Execution via Policy Distillation
- Computer Science, AAMAS
- 2020
A new framework known as centralized training and exploration with decentralized execution via policy distillation is proposed, guided by this framework and the maximum-entropy learning technique, which can achieve significantly better performance and higher sample efficiency than a cutting-edge baseline on several multi-agent DRL benchmarks.
Learning to Walk via Deep Reinforcement Learning
- Computer Science, Robotics: Science and Systems
- 2019
A sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies is proposed and achieves state-of-the-art performance on simulated benchmarks with a single set of hyperparameters.
Kickstarting Deep Reinforcement Learning
- Computer Science, ArXiv
- 2018
It is shown that, on a challenging and computationally-intensive multi-task benchmark (DMLab-30), kickstarted training improves the data efficiency of new agents, making it significantly easier to iterate on their design.
Learning to Walk in the Real World with Minimal Human Effort
- Computer Science, CoRL
- 2020
This paper develops a system for learning legged locomotion policies with deep RL in the real world with minimal human effort by developing a multi-task learning procedure, an automatic reset controller, and a safety-constrained RL framework.
Continual Model-Based Reinforcement Learning with Hypernetworks
- Computer Science, 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021
HyperCRL, a method that continually learns the encountered dynamics in a sequence of tasks using task-conditional hypernetworks, outperforms existing continual learning alternatives that rely on fixed-capacity networks, and performs competitively with baselines that remember an ever-increasing coreset of past experience.
GLiDE: Generalizable Quadrupedal Locomotion in Diverse Environments with a Centroidal Model
- Engineering, WAFR
- 2022
This work explores how RL can be effectively used with a centroidal model to generate robust control policies for quadrupedal locomotion and shows the potential of the method by demonstrating stepping-stone locomotion, two-legged in-place balance, balance beam locomotion, and sim-to-real transfer without further adaptations.
References
SHOWING 1-10 OF 33 REFERENCES
DeepLoco: dynamic locomotion skills using hierarchical deep reinforcement learning
- Engineering, ACM Trans. Graph.
- 2017
This paper aims to learn a variety of environment-aware locomotion skills with a limited amount of prior knowledge by adopting a two-level hierarchical control framework and training both levels using deep reinforcement learning.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
- Computer Science, ICLR
- 2016
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Learning locomotion skills using DeepRL: does the choice of action space matter?
- Computer Science, Symposium on Computer Animation
- 2017
It is demonstrated that the local feedback provided by higher-level action parameterizations can significantly impact the learning, robustness, and motion quality of the resulting policies.
Policy Distillation
- Computer Science, ICLR
- 2016
A novel method called policy distillation is presented that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient.
The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract)
- Computer Science, IJCAI
- 2013
The promise of ALE is illustrated by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning, and an evaluation methodology made possible by ALE is proposed.
Learning human behaviors from motion capture by adversarial imitation
- Computer Science, ArXiv
- 2017
Generative adversarial imitation learning is extended to enable training of generic neural network policies to produce humanlike movement patterns from limited demonstrations consisting only of partially observed state features, without access to actions, even when the demonstrations come from a body with different and unknown physical parameters.
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
- Computer Science, NIPS
- 2016
h-DQN is presented, a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning, and allows for flexible goal specifications, such as functions over entities and relations.
Distral: Robust multitask reinforcement learning
- Computer Science, NIPS
- 2017
This work proposes a new approach for joint training of multiple tasks, which it refers to as Distral (Distill & transfer learning), and shows that the proposed learning process is more robust and more stable, attributes that are critical in deep reinforcement learning.
Learning and Transfer of Modulated Locomotor Controllers
- Computer Science, ArXiv
- 2016
A novel architecture and training procedure for locomotion tasks where a monolithic end-to-end architecture fails completely, learning with a pre-trained spinal module succeeds at multiple high-level tasks, and enables the effective exploration required to learn from sparse rewards.
A Deep Hierarchical Approach to Lifelong Learning in Minecraft
- Computer Science, AAAI
- 2017
We propose a lifelong learning system that has the ability to reuse and transfer knowledge from one task to another while efficiently retaining the previously learned knowledge-base. Knowledge is…