Corpus ID: 208139268

Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning

K. Luck, H. B. Amor, R. Calandra
Humans and animals are capable of quickly learning new behaviours to solve new tasks. Yet, we often forget that they also rely on a highly specialized morphology that co-adapted with motor control over thousands of years. Although compelling, the idea of co-adapting morphology and behaviours in robots is often infeasible because of long manufacturing times and the need to re-design an appropriate controller for each morphology. In this paper, we propose a novel approach to…
Task-Agnostic Morphology Evolution
Without any task or reward specification, TAME evolves morphologies by only applying randomly sampled action primitives on a population of agents, using an information-theoretic objective that efficiently ranks agents by their ability to reach diverse states in the environment and the causality of their actions.
Embodied Intelligence via Learning and Evolution
Deep Evolutionary Reinforcement Learning (DERL) is introduced: a novel computational framework which can evolve diverse agent morphologies to learn challenging locomotion and manipulation tasks in complex environments using only low-level egocentric sensory information.
Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects
This work develops a novel Bayesian Optimization algorithm that efficiently co-designs the morphology and grasping skills jointly through learned latent-space representations, and demonstrates the effectiveness of this approach in discovering robust and cost-efficient hand morphologies for grasping novel objects.
Neural fidelity warping for efficient robot morphology design
The proposed fidelity-warping mechanism can learn representations of learning epochs and tasks to model non-stationary covariances between continuous fidelity evaluations, which prove challenging for off-the-shelf stationary kernels.
Efficient Hyperparameter Optimization for Physics-based Character Animation
This work proposes a novel Curriculum-based Multi-Fidelity Bayesian Optimization framework (CMFBO) for efficient hyperparameter optimization of DRL-based character control systems, using curriculum-based task difficulty as the fidelity criterion, and shows that hyperparameters optimized through the algorithm yield at least a 5x efficiency gain compared to the author-released settings in DeepMimic.
Hammers for Robots: Designing Tools for Reinforcement Learning Agents
Taking a user-centered design (UCD) approach, the potential of a human, instead of an algorithm, redesigning the agent’s tool is explored, including what it means to understand an RL agent's experience, beliefs, tendencies, and goals.
An End-to-End Differentiable Framework for Contact-Aware Robot Design
An end-to-end differentiable framework for contact-aware robot design that allows for the design of articulated rigid robots with arbitrary, complex geometry, built on a differentiable rigid-body simulator that can handle contact-rich scenarios and computes analytical gradients for a full spectrum of kinematic and dynamic parameters.
Co-designing hardware and control for robot hands
Policy gradient methods can be used for mechanical and computational co-design of robot manipulators.
Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning
This study proposes to model aspects of the robot's hardware as a "mechanical policy", analogous to and optimized jointly with its computational counterpart, and shows that, by modeling such mechanical policies as auto-differentiable computational graphs, the ensuing optimization problem can be solved efficiently by gradient-based algorithms from the Policy Optimization family.
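The "mechanical policy" idea can be illustrated with a toy differentiable model (a hypothetical sketch, not the paper's actual setup): a hardware parameter h (say, a link length) and a controller parameter k (a gain) both receive gradients from the same differentiable reward, so one gradient loop co-optimizes both.

```python
import numpy as np

# Toy sketch: reward r(h, k) = -(h * k - target)^2 is differentiable in
# both the hardware parameter h and the controller parameter k, so both
# can be updated with the same gradient-ascent step.
def reward(h, k, target=2.0):
    return -(h * k - target) ** 2

def grads(h, k, target=2.0):
    err = h * k - target
    return -2.0 * err * k, -2.0 * err * h  # d r/d h, d r/d k

h, k = 0.5, 0.5
lr = 0.05
for _ in range(500):
    gh, gk = grads(h, k)
    h += lr * gh  # hardware update, jointly with...
    k += lr * gk  # ...the controller update
print(round(h * k, 3))  # → 2.0: the joint optimum is reached
```

In a real system the analytic `grads` would be replaced by automatic differentiation through the simulator, which is the role the auto-differentiable computational graph plays in the abstract above.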


Data-efficient Learning of Morphology and Controller for a Microrobot
Experimental results show that HPC-BBO outperforms multiple competitive baselines, and yields a 360% reduction in production cycles over standard Bayesian optimization, thus reducing the hypothetical manufacturing time of the microrobot from 21 to 4 months.
From the Lab to the Desert: Fast Prototyping and Learning of Robot Locomotion
The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments, and sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.
Real-world evolution adapts robot morphology and control to hardware limitations
This paper applies real-world multi-objective evolutionary optimization to optimize both control and morphology of a four-legged, mammal-inspired robot, and shows that evolution under the different hardware limitations results in comparable performance for low and moderate speeds.
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
This paper proposes soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework, and achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods.
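The maximum-entropy objective behind soft actor-critic can be sketched numerically (an illustrative toy, not the algorithm itself, which trains neural actor and critic networks off-policy): for a discrete action set with Q-values q, the entropy-regularized optimal policy is a Boltzmann distribution pi ∝ exp(q / alpha).

```python
import numpy as np

# Entropy-regularized policy for given Q-values and temperature alpha.
def soft_policy(q, alpha):
    z = np.exp((q - q.max()) / alpha)  # shift q for numerical stability
    return z / z.sum()

q = np.array([1.0, 0.9, -2.0])
for alpha in (1.0, 0.1, 0.01):
    pi = soft_policy(q, alpha)
    entropy = -(pi * np.log(pi)).sum()
    objective = pi @ q + alpha * entropy  # E[Q] + alpha * H(pi)
    print(alpha, pi.round(3), round(objective, 3))
```

As alpha approaches 0 the policy collapses onto the greedy arg max of Q; a larger alpha spreads probability mass across near-optimal actions, which is the stochastic-exploration behaviour the maximum-entropy framework is designed to keep.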
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
The approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution, and gives the controller access to design parameters to allow it to tailor its policy to each design in the distribution.
Novelty-Based Evolutionary Design of Morphing Underwater Robots
The approach adopted here represents a novel computer-aided, bio-inspired design paradigm merging human and artificial creativity, and may also have interesting implications for artificial life, with the potential to contribute to exploring underwater locomotion "as-it-could-be".
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
This work explores the use of Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients, and highlights several advantages of ES as a black-box optimization technique.
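A basic ES update can be sketched in a few lines (a simplified toy in the spirit of this approach; refinements such as rank normalization and antithetic sampling are omitted). The gradient of the expected fitness under Gaussian parameter noise is estimated from fitness-weighted noise samples, requiring only black-box evaluations of f:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])
f = lambda theta: -np.sum((theta - target) ** 2)  # black-box fitness

theta = np.zeros(3)
sigma, lr, pop = 0.1, 0.05, 50  # noise scale, step size, population size
for _ in range(300):
    eps = rng.standard_normal((pop, 3))
    fitness = np.array([f(theta + sigma * e) for e in eps])
    fitness = fitness - fitness.mean()  # variance-reducing baseline
    # Estimate grad E[f(theta + sigma*eps)] from fitness-weighted noise.
    theta += lr / (pop * sigma) * eps.T @ fitness
print(theta.round(2))  # close to the target [1.0, -2.0, 0.5]
```

Because only fitness values are needed, the per-sample evaluations parallelize trivially, which is one of the scalability advantages highlighted above.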
Reinforcement Learning for Improving Agent Design
  • David R Ha
  • Computer Science, Mathematics
  • Artificial Life
  • 2019
It is demonstrated that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning.
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown…
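The core REINFORCE score-function update can be sketched on a two-armed Bernoulli bandit (a minimal illustrative example; the article treats the general class for networks of stochastic units). The parameter is nudged by the reward-weighted gradient of the log-probability of the sampled action, with a constant baseline subtracted to reduce variance:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rewards = [0.2, 0.8]  # arm 1 pays off more often
theta, lr, baseline = 0.0, 0.1, 0.5  # logit of P(arm 1), step size, baseline

for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))       # P(choose arm 1)
    a = rng.random() < p                   # sample an action
    r = float(rng.random() < true_rewards[int(a)])  # Bernoulli reward
    grad_logp = (1.0 - p) if a else -p     # d log pi(a) / d theta
    theta += lr * (r - baseline) * grad_logp  # REINFORCE update

p = 1.0 / (1.0 + np.exp(-theta))
print(round(p, 2))  # probability of the better arm, close to 1
```

The update is unbiased for any action-independent baseline, which is why the fixed 0.5 here does not change what the policy converges to, only how noisy the path is.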
A Survey on Policy Search for Robotics
This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy, and presents a unified view on existing algorithms.