# Model-Based Active Exploration

```bibtex
@article{Shyam2019ModelBasedAE,
  title   = {Model-Based Active Exploration},
  author  = {Pranav Shyam and Wojciech Jaśkowski and Faustino J. Gomez},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1810.12162}
}
```

Efficient exploration is an unsolved problem in Reinforcement Learning which is usually addressed by reactively rewarding the agent for fortuitously encountering novel situations. This paper introduces an efficient active exploration algorithm, Model-Based Active eXploration (MAX), which uses an ensemble of forward models to plan to observe novel events. This is carried out by optimizing agent behaviour with respect to a measure of novelty derived from the Bayesian perspective of exploration…
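The ensemble-disagreement idea at the heart of this approach can be illustrated with a minimal sketch. This is not the paper's exact utility (MAX uses a Jensen-Rényi divergence over the ensemble's predictive distributions); here, as a simplified proxy, a candidate transition is scored by the variance of the ensemble's predicted next-state means:

```python
import numpy as np

def ensemble_disagreement(models, state, action):
    """Novelty proxy: total variance of the ensemble's predicted next-state
    means. High disagreement marks transitions the models have not yet
    learned, so a planner can seek them out proactively."""
    preds = np.stack([m(state, action) for m in models])  # (n_models, state_dim)
    return float(preds.var(axis=0).sum())
```

A planner would score imagined action sequences by the summed disagreement along each trajectory and execute the most informative one, rather than waiting to stumble on novelty reactively.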

#### 82 Citations

Receding Horizon Curiosity

- Computer Science, Mathematics
- CoRL
- 2019

Models optimal exploration in an unknown Markov decision process (MDP) by interleaving episodic exploration with Bayesian nonlinear system identification, and presents an effective trajectory-optimization-based approximate solution to this otherwise intractable problem.

Reinforcement Learning through Active Inference

- Computer Science, Engineering
- ArXiv
- 2020

The central tenet of reinforcement learning (RL) is that agents seek to maximize the sum of cumulative rewards. In contrast, active inference, an emerging framework within cognitive and computational…

Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation

- Computer Science
- 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021

Presents MEEE, a model-ensemble method combining optimistic exploration with weighted exploitation, which outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.

Explicit Explore-Exploit Algorithms in Continuous State Spaces

- Computer Science, Mathematics
- NeurIPS
- 2019

It is shown that, under realizability and optimal planning assumptions, the algorithm provably finds a near-optimal policy with a number of samples polynomial in a structural complexity measure shown to be low in several natural settings.

MADE: Exploration via Maximizing Deviation from Explored Regions

- Computer Science, Mathematics
- ArXiv
- 2021

This work proposes a new exploration approach via maximizing the deviation of the occupancy of the next policy from the explored regions, giving rise to a new intrinsic reward that adjusts existing bonuses.

Scaling Active Inference

- Computer Science, Engineering
- 2020 International Joint Conference on Neural Networks (IJCNN)
- 2020

This work presents a working implementation of active inference that applies to high-dimensional tasks, with proof-of-principle results demonstrating efficient exploration and an order of magnitude increase in sample efficiency over strong model-free baselines.

SAMBA: Safe Model-Based & Active Reinforcement Learning

- Computer Science, Mathematics
- ArXiv
- 2020

In this paper, we propose SAMBA, a novel framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics. Our method builds upon PILCO…

Planning to Explore via Self-Supervised World Models

- Computer Science, Mathematics
- ICML
- 2020

Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods and, in fact, almost matches the performance of an oracle which has access to rewards.

Self-Supervised Exploration via Disagreement

- Computer Science, Mathematics
- ICML
- 2019

This paper proposes a formulation for exploration inspired by work in the active learning literature: it trains an ensemble of dynamics models and incentivizes the agent to explore such that the disagreement among those ensemble members is maximized, resulting in sample-efficient exploration.

#### References

Showing 1–10 of 49 references

Efficient Exploration in Reinforcement Learning

- Computer Science
- Encyclopedia of Machine Learning and Data Mining
- 2017

Exploration is a key aspect of reinforcement learning that is missing from standard supervised learning settings; minimizing the number of information-gathering actions helps optimize the standard goal in reinforcement learning.

VIME: Variational Information Maximizing Exploration

- Computer Science, Mathematics
- NIPS
- 2016

VIME is introduced, an exploration strategy based on maximizing information gain about the agent's belief over environment dynamics; it efficiently handles continuous state and action spaces and can be applied with several different underlying RL algorithms.
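VIME's bonus is the information gain about the dynamics model's parameters: the KL divergence between the agent's belief after and before an observation. As a minimal sketch, assuming a diagonal-Gaussian variational posterior (VIME itself maintains this belief with a Bayesian neural network), the per-step bonus reduces to a closed-form Gaussian KL:

```python
import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between diagonal Gaussians. In the VIME setting, q is the
    updated belief over dynamics parameters and p the belief before the
    observation; the KL measures how informative the transition was."""
    return float(np.sum(np.log(sigma_p / sigma_q)
                        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
                        - 0.5))
```

A transition that leaves the belief unchanged yields zero bonus; the more the posterior shifts, the larger the intrinsic reward added to the task reward.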

Self-Supervised Exploration via Disagreement

- Computer Science, Mathematics
- ICML
- 2019

This paper proposes a formulation for exploration inspired by work in the active learning literature: it trains an ensemble of dynamics models and incentivizes the agent to explore such that the disagreement among those ensemble members is maximized, resulting in sample-efficient exploration.

An information-theoretic approach to curiosity-driven reinforcement learning

- Computer Science, Medicine
- Theory in Biosciences
- 2011

It is shown that Boltzmann-style exploration, one of the main exploration methods used in reinforcement learning, is optimal from an information-theoretic point of view, in that it optimally trades expected return for the coding cost of the policy.

Unifying Count-Based Exploration and Intrinsic Motivation

- Computer Science
- NIPS
- 2016

This work uses density models to measure uncertainty and proposes a novel algorithm for deriving a pseudo-count from an arbitrary density model, which enables count-based exploration algorithms to generalize to the non-tabular case.
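The pseudo-count is recovered from the density model's probability of a state before and after one more observation of it. A minimal sketch of that identity (the bonus scale `beta` below is an illustrative choice, not prescribed by the paper):

```python
def pseudo_count(rho, rho_next):
    """Pseudo-count derived from a density model's probability of a state
    before (rho) and after (rho_next) observing it once more:
        N = rho * (1 - rho_next) / (rho_next - rho)
    Valid when the model is learning-positive, i.e. rho_next > rho."""
    return rho * (1.0 - rho_next) / (rho_next - rho)

def exploration_bonus(n, beta=0.05):
    """Count-based bonus ~ beta / sqrt(N); small constant avoids div-by-zero."""
    return beta / (n + 0.01) ** 0.5
```

For an empirical count model that has seen a state 2 times in 10 steps, `pseudo_count(2/10, 3/11)` recovers exactly 2.0, matching the true visit count; with a learned density model the same formula yields a generalizing surrogate for counts.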

Explorations in efficient reinforcement learning

- Computer Science
- 1999

Describes reinforcement learning methods that solve sequential decision-making problems by learning from trial and error, categorizes different classes of such problems, and introduces new methods for solving them.

Intrinsically motivated model learning for developing curious robots

- Computer Science
- Artif. Intell.
- 2017

Experiments show that combining the agent's intrinsic rewards with external task rewards enables the agent to learn faster than using external rewards alone, and the approach's applicability to learning on robots is demonstrated.

Deep Exploration via Bootstrapped DQN

- Computer Science, Mathematics
- NIPS
- 2016

Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and…

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

- Computer Science, Mathematics
- NeurIPS
- 2018

This paper proposes a new algorithm, probabilistic ensembles with trajectory sampling (PETS), that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation; it matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks while requiring significantly fewer samples.
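The trajectory-sampling recipe can be sketched as a tiny model-predictive-control loop: sample candidate action sequences, propagate each imagined rollout through a randomly chosen ensemble member at every step, and execute the first action of the best sequence. This is a simplified sketch under stated assumptions (random shooting instead of the paper's CEM optimizer, deterministic toy models instead of probabilistic networks, and an illustrative quadratic reward):

```python
import numpy as np

rng = np.random.default_rng(0)

def plan_first_action(models, state, horizon=5, n_candidates=64, act_dim=1):
    """Random-shooting MPC with trajectory sampling: each step of each
    imagined rollout uses a randomly chosen ensemble member, so model
    uncertainty is propagated along the whole trajectory."""
    def reward(s):  # assumed task reward for illustration: stay near the origin
        return -float(np.sum(s ** 2))

    best_seq, best_ret = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.normal(size=(horizon, act_dim))  # candidate action sequence
        s, ret = np.array(state, dtype=float), 0.0
        for a in seq:
            m = models[rng.integers(len(models))]  # TS: resample member per step
            s = m(s, a)
            ret += reward(s)
        if ret > best_ret:
            best_seq, best_ret = seq, ret
    return best_seq[0]  # MPC: execute only the first action, then replan
```

Resampling the ensemble member per step is the "trajectory sampling" design choice: it spreads imagined rollouts across the models' disagreement instead of committing each rollout to a single model's bias.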

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

- Computer Science
- ICML
- 2011

PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.