Effective reinforcement learning through evolutionary surrogate-assisted prescription

@inproceedings{Francon2020EffectiveRL,
  title={Effective reinforcement learning through evolutionary surrogate-assisted prescription},
  author={Olivier Francon and Santiago Gonzalez and Babak Hodjat and Elliot Meyerson and Risto Miikkulainen and Xin Qiu and Hormoz Shahrzad},
  booktitle={Proceedings of the 2020 Genetic and Evolutionary Computation Conference},
  year={2020}
}
There is now significant historical data available on decision making in organizations, consisting of the decision problem, what decisions were made, and how desirable the outcomes were. Using this data, it is possible to learn a surrogate model, and with that model, evolve a decision strategy that optimizes the outcomes. This paper introduces a general such approach, called Evolutionary Surrogate-Assisted Prescription, or ESP. The surrogate is, for example, a random forest or a neural network… 
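
The two steps described in the abstract (learn a surrogate from historical data, then evolve a decision strategy against that surrogate) can be sketched roughly as follows. This is a minimal illustration of a single round, not the authors' implementation. Everything named here is an assumption made for the example: the synthetic data generator `true_outcome`, the k-nearest-neighbor surrogate (standing in for the random forest or neural network the abstract mentions), the linear prescriptor `action = w * context + b`, and all hyperparameters.

```python
import random


def true_outcome(context, action):
    # Hypothetical ground truth, used only to synthesize "historical" records.
    # The best action for a given context is 2 * context.
    return -(action - 2.0 * context) ** 2


def make_history(n, rng):
    # Each record is (context, action taken, observed outcome).
    records = []
    for _ in range(n):
        context = rng.uniform(-1.0, 1.0)
        action = rng.uniform(-3.0, 3.0)
        records.append((context, action, true_outcome(context, action)))
    return records


def fit_surrogate(history, k=5):
    # Assumed surrogate: k-nearest-neighbor regression over (context, action).
    def predict(context, action):
        nearest = sorted(
            history,
            key=lambda r: (r[0] - context) ** 2 + (r[1] - action) ** 2,
        )[:k]
        return sum(r[2] for r in nearest) / len(nearest)

    return predict


def evolve_prescriptor(predict, contexts, rng, pop_size=20, generations=20, sigma=0.3):
    # A prescriptor is a genome (w, b) that maps context -> action = w * context + b.
    def fitness(genome):
        # Fitness is computed from the surrogate only, never from true_outcome.
        w, b = genome
        return sum(predict(c, w * c + b) for c in contexts) / len(contexts)

    population = [(rng.uniform(-3.0, 3.0), rng.uniform(-1.0, 1.0)) for _ in range(pop_size)]
    trace = []  # best surrogate fitness per generation
    best = population[0]
    for _ in range(generations):
        scored = sorted(((fitness(g), g) for g in population), reverse=True)
        trace.append(scored[0][0])
        best = scored[0][1]
        # Elitism: the top quarter survives unchanged; the rest are mutated copies.
        elites = [g for _, g in scored[: pop_size // 4]]
        children = []
        while len(elites) + len(children) < pop_size:
            w, b = rng.choice(elites)
            children.append((w + rng.gauss(0.0, sigma), b + rng.gauss(0.0, sigma)))
        population = elites + children
    return best, trace


if __name__ == "__main__":
    rng = random.Random(0)
    history = make_history(200, rng)
    predict = fit_surrogate(history)
    contexts = [i / 10 - 1.0 for i in range(21)]
    (w, b), trace = evolve_prescriptor(predict, contexts, rng)
    print(f"evolved prescriptor: action = {w:.2f} * context + {b:.2f}")
    print(f"surrogate fitness: first generation {trace[0]:.3f}, last {trace[-1]:.3f}")
```

Because the elites are carried over unchanged and the surrogate is deterministic, the best surrogate fitness in `trace` cannot decrease from one generation to the next. Whether the evolved strategy is actually good depends on how well the surrogate matches the real outcomes, which this sketch does not check.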


Optimal Agent Search Using Surrogate-Assisted Genetic Algorithms

This study proposes surrogate-assisted genetic algorithms (SGAs), in which surrogate models take part in the fitness evaluation of the genetic algorithm by predicting the cumulative reward obtained with an agent’s DNN parameters.

A novelty-search-based evolutionary reinforcement learning algorithm for continuous optimization problems

Novelty search is integrated into the framework of the ERL algorithm, where it guides the agent or population toward regions of the state space that it has rarely or never visited.

From Prediction to Prescription: Evolutionary Optimization of Nonpharmaceutical Interventions in the COVID-19 Pandemic

This article demonstrates how evolutionary AI can be used to facilitate the next step, i.e., automatically determining the most effective intervention strategies, in dealing with COVID-19 as well as possible future pandemics.

EVOTER: Evolution of Transparent Explainable Rule-sets

This paper advocates an alternative approach, EVOTER, in which the models are transparent and explainable to begin with: it evolves rule-sets based on simple logical expressions, which form a promising foundation for building trustworthy AI systems for real-world applications in the future.

From Prediction to Prescription: AI-Based Optimization of Non-Pharmaceutical Interventions for the COVID-19 Pandemic

This paper demonstrates how evolutionary AI could be used to facilitate the next step after predicting how the COVID-19 pandemic spreads, i.e., automatically determining the most effective intervention strategies, and suggests creative ways in which restrictions can be implemented softly.

Evolution of Transparent Explainable Rule-sets

This paper advocates an alternative approach, EVOTER, in which the models are transparent and explainable to begin with: it evolves rule-sets based on simple logical expressions, which form a promising foundation for building trustworthy AI systems for real-world applications in the future.

Simple genetic operators are universal approximators of probability distributions (and other advantages of expressive encodings)

The conclusion is that, across evolutionary computation areas as diverse as genetic programming, neuroevolution, genetic algorithms, and theory, expressive encodings can be a key to understanding and realizing the full power of evolution.

Creative AI Through Evolutionary Computation

R. Miikkulainen, Evolution in Action: Past, Present and Future, 2020 (Computer Science)
The main power of artificial intelligence is not in modeling what we already know, but in creating solutions that are new. Such solutions exist in extremely large, high-dimensional, and complex

Evolution of neural networks


References


A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

A generalized version of the Bellman equation is proposed to learn a single parametric representation for optimal policies over the space of all possible preferences in MORL, with the goal of enabling few-shot adaptation to new tasks.

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

This work explores the use of Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients, and highlights several advantages of ES as a black-box optimization technique.

Evolved Policy Gradients

Empirical results show that the evolved policy gradient algorithm (EPG) learns faster on several randomized environments than an off-the-shelf policy gradient method, that its learned loss can generalize to out-of-distribution test-time tasks, and that it exhibits qualitatively different behavior from other popular metalearning algorithms.

Multiobjective Reinforcement Learning: A Comprehensive Overview

The basic architecture, research topics, and naïve solutions of MORL are introduced first; several representative MORL approaches and some important directions of recent research are then comprehensively reviewed.

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

It is shown that combining DNNs with novelty search, which was designed to encourage exploration on tasks with deceptive or sparse reward functions, can solve a high-dimensional problem on which reward-maximizing algorithms fail, and expands the sense of the scale at which GAs can operate.

High-Dimensional Continuous Control Using Generalized Advantage Estimation

This work addresses the large number of samples typically required and the difficulty of obtaining stable and steady improvement despite the nonstationarity of the incoming data by using value functions to substantially reduce the variance of policy gradient estimates at the cost of some bias.

Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel

A new framework (RIO) is developed that makes it possible to estimate uncertainty in any pretrained standard NN without modifications to model architecture or training pipeline, and provides an important ingredient for building real-world NN applications.

Evolutionary Computation for Reinforcement Learning

Research is surveyed on the application of evolutionary computation to reinforcement learning, overviewing methods for evolving neural-network topologies and weights, hybrid methods that also use temporal-difference methods, coevolutionary methods for multi-agent settings, generative and developmental systems, and methods for on-line evolutionary reinforcement learning.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
...