Scaling MAP-Elites to deep neuroevolution

@inproceedings{Colas2020ScalingMT,
  title={Scaling MAP-Elites to deep neuroevolution},
  author={C{\'e}dric Colas and Joost Huizinga and Vashisht Madhavan and Jeff Clune},
  booktitle={Proceedings of the 2020 Genetic and Evolutionary Computation Conference},
  year={2020}
}
Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications, including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks, or evolving robot morphologies to discover new gaits. However, present implementations of ME and other QD algorithms seem to be limited to low-dimensional controllers with far fewer parameters than modern deep neural network models. In this paper, we propose to… 
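The MAP-Elites loop at the heart of the paper can be sketched in a few lines. Everything below is illustrative: the toy objective, the one-dimensional behaviour descriptor, and the hyper-parameters are assumptions for the sketch, not the controllers or settings used in the paper.

```python
import random

def fitness(x):
    # Toy objective: maximise -||x||^2 (best value is 0 at the origin).
    return -sum(v * v for v in x)

def behaviour(x):
    # Toy behaviour descriptor: the first coordinate, roughly in [-1, 1].
    return x[0]

def map_elites(dims=5, cells=10, iters=2000, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index -> (solution, fitness): one elite per niche

    def cell_of(b):
        # Discretise a behaviour value in [-1, 1] into `cells` bins.
        return min(cells - 1, max(0, int((b + 1) / 2 * cells)))

    for _ in range(iters):
        if archive and rng.random() < 0.9:
            # Select a random elite and mutate it with Gaussian noise.
            parent, _ = rng.choice(list(archive.values()))
            child = [v + rng.gauss(0, 0.1) for v in parent]
        else:
            # Occasionally inject a fresh random solution.
            child = [rng.uniform(-1, 1) for _ in range(dims)]
        f, c = fitness(child), cell_of(behaviour(child))
        # Replace the cell's elite only if the child is fitter.
        if c not in archive or f > archive[c][1]:
            archive[c] = (child, f)
    return archive
```

The archive ends up holding the best solution found for each behaviour niche, which is what "illuminating the search space" refers to in the MAP-Elites literature.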


Sample Efficient Quality Diversity for Neural Continuous Control (2020)
We propose a novel Deep Neuroevolution algorithm, QD-RL, that combines the strengths of off-policy reinforcement learning (RL) algorithms and Quality Diversity (QD) approaches to solve continuous…
QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning
QD-RL, a novel reinforcement learning algorithm that incorporates the strengths of off-policy RL algorithms into Quality Diversity approaches; it solves challenging exploration and control problems with deceptive rewards while being more than 15 times more sample-efficient than its evolutionary counterparts.
Policy gradient assisted MAP-Elites
PGA-MAP-Elites is presented, a novel algorithm that enables MAP-Elites to efficiently evolve large neural network controllers by introducing a gradient-based variation operator inspired by Deep Reinforcement Learning.
Diversity Policy Gradient for Sample Efficient Quality-Diversity Optimization
A novel algorithm, QD-PG, which combines the strength of Policy Gradient algorithms and Quality Diversity approaches to produce a collection of diverse and high-performing neural policies in continuous control environments, and is significantly more sample-efficient than its evolutionary competitors.
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning
One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks, providing insight into the limitations of current DQD algorithms in domains where gradients must be approximated.
Policy Manifold Search for Improving Diversity-based Neuroevolution
This work proposes a novel approach to diversity-based policy search via Neuroevolution that leverages learned latent representations of the policy parameters, which capture the local structure of the data.
Autotelic Agents with Intrinsically Motivated Goal-Conditioned RL: A Short Survey (2022)
Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey
A typology of methods at the intersection of deep RL and developmental approaches, in which deep RL algorithms are trained to tackle the developmental-robotics problem of autonomously acquiring open-ended repertoires of skills.
Language-Conditioned Goal Generation: a New Approach to Language Grounding for RL
This paper proposes a particular instantiation of using language to condition goal generators, which decouples sensorimotor learning from language acquisition and enables agents to demonstrate a diversity of behaviors for any given instruction.

References

Showing 1–10 of 55 references
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
This paper shows that algorithms invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search and quality diversity algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability.
Illuminating search spaces by mapping elites
The Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm illuminates search spaces, allowing researchers to understand how interesting attributes of solutions combine to affect performance, either positively or, equally of interest, negatively.
Quality and Diversity Optimization: A Unifying Modular Framework
A unifying framework of QD optimization algorithms is presented that covers the two main algorithms of this family (the multidimensional archive of phenotypic elites and novelty search with local competition), and that highlights the large variety of variants that can be investigated within this family.
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
This work explores the use of Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients, and highlights several advantages of ES as a black-box optimization technique.
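The ES update referenced above treats the objective as a black box: it estimates a gradient purely from perturbed evaluations of the parameters. The sketch below is a minimal illustration of that idea with a mean-reward baseline; the objective and all hyper-parameters are illustrative assumptions, not settings from the cited work.

```python
import random

def es_step(theta, f, rng, pop=100, sigma=0.1, lr=0.05):
    """One ES update: estimate the gradient of E[f(theta + sigma * eps)]
    from `pop` perturbed evaluations, then take a gradient-ascent step."""
    samples = []
    for _ in range(pop):
        # Sample a Gaussian perturbation and evaluate the perturbed parameters.
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        reward = f([t + sigma * e for t, e in zip(theta, eps)])
        samples.append((eps, reward))
    # Subtract the mean reward as a simple variance-reducing baseline.
    baseline = sum(r for _, r in samples) / pop
    grad = [0.0] * len(theta)
    for eps, r in samples:
        for i, e in enumerate(eps):
            grad[i] += (r - baseline) * e / (pop * sigma)
    # Ascend the estimated gradient (we are maximising reward).
    return [t + lr * g for t, g in zip(theta, grad)]
```

For example, repeatedly applying `es_step` to `f = lambda x: -(x[0] - 1.0) ** 2` drives the single parameter toward its optimum at 1 without ever computing an analytic gradient, which is what makes ES attractive for non-differentiable controllers.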
Exploiting Open-Endedness to Solve Problems Through the Search for Novelty
By decoupling open-ended search from artificial-life worlds, the raw search for novelty can be applied to real-world problems, where it significantly outperforms objective-based search in the deceptive maze navigation task.
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
It is shown that combining DNNs with novelty search, which was designed to encourage exploration on tasks with deceptive or sparse reward functions, can solve a high-dimensional problem on which reward-maximizing algorithms fail, broadening the scale at which GAs can operate.
Robots that can adapt like animals
An intelligent trial-and-error algorithm is introduced that allows robots to adapt to damage in less than two minutes in large search spaces, without requiring self-diagnosis or pre-specified contingency plans, and may shed light on the principles that animals use to adapt to injury.
Evolving a diversity of virtual creatures through novelty search and local competition
The results in an experiment evolving locomoting virtual creatures show that novelty search with local competition discovers more functional morphological diversity within a single run than models with global competition, which are more predisposed to converge.
Understanding the difficulty of training deep feedforward neural networks
The objective is to better understand why standard gradient descent from random initialization does so poorly with deep neural networks, and to use these insights to help design better algorithms in the future.