• Corpus ID: 6085454

Efficient reinforcement learning using Gaussian processes

@inproceedings{Deisenroth2010EfficientRL,
  title={Efficient reinforcement learning using Gaussian processes},
  author={Marc Peter Deisenroth},
  year={2010}
}
This book examines Gaussian processes in both model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, it introduces PILCO, a model-based policy search framework that takes model uncertainties consistently into account during long-term planning to reduce model bias. Second, it proposes principled algorithms for robust filtering and smoothing in GP dynamic systems.
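The key PILCO idea above, learning a GP dynamics model and carrying its predictive uncertainty through long-horizon predictions, can be sketched in a few lines. The sketch below is only illustrative: it uses scikit-learn's GP regressor, a made-up 1-D system, and replaces PILCO's analytic moment matching with sampled rollouts.

```python
# Minimal sketch of the PILCO idea, not the reference implementation:
# learn a GP dynamics model and propagate its predictive uncertainty
# through a rollout. PILCO uses analytic moment matching; here the
# uncertainty is propagated by sampling, which is only an approximation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def true_dynamics(x, u):
    """Hypothetical 1-D system used only to generate training data."""
    return 0.9 * x + 0.5 * np.sin(u) + 0.01 * rng.standard_normal()

# Collect a small batch of (state, action) -> next-state transitions.
X = rng.uniform(-2, 2, size=(50, 2))                    # columns: [x, u]
y = np.array([true_dynamics(x, u) for x, u in X])

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

def rollout(policy, x0, horizon=10, n_particles=30):
    """Propagate model uncertainty by sampling GP predictions forward."""
    particles = np.full(n_particles, x0, dtype=float)
    for _ in range(horizon):
        u = policy(particles)
        mean, std = gp.predict(np.column_stack([particles, u]), return_std=True)
        particles = mean + std * rng.standard_normal(n_particles)
    return particles.mean(), particles.std()

mean, std = rollout(lambda x: -0.5 * x, x0=1.0)
print(f"predicted terminal state: {mean:.3f} +/- {std:.3f}")
```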
Off-policy reinforcement learning with Gaussian processes
TLDR
An off-policy Bayesian nonparametric approximate reinforcement learning framework that employs a Gaussian process model of the value function and has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations.
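One ingredient of this framework, modeling the value function itself with a GP so that value estimates carry posterior uncertainty, can be sketched simply. The example below only regresses Monte-Carlo returns of visited states with a GP on a made-up random-walk task; the off-policy machinery and automatic basis selection of the cited work are omitted.

```python
# Minimal sketch of the "value function as a GP" idea: regress
# Monte-Carlo returns of visited states with a GP, so the value
# estimate comes with a posterior variance.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
gamma = 0.95

def sample_episode(n_steps=20):
    """Hypothetical 1-D random walk with reward = -|state|."""
    states, rewards, x = [], [], rng.uniform(-1, 1)
    for _ in range(n_steps):
        states.append(x)
        rewards.append(-abs(x))
        x = 0.8 * x + 0.2 * rng.standard_normal()
    return np.array(states), np.array(rewards)

# Compute discounted returns for each visited state.
S, G = [], []
for _ in range(20):
    states, rewards = sample_episode()
    ret = 0.0
    returns = np.zeros_like(rewards)
    for t in reversed(range(len(rewards))):
        ret = rewards[t] + gamma * ret
        returns[t] = ret
    S.append(states)
    G.append(returns)

gp_v = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp_v.fit(np.concatenate(S).reshape(-1, 1), np.concatenate(G))

v_mean, v_std = gp_v.predict(np.array([[0.0], [1.0]]), return_std=True)
print("V(0), V(1) with uncertainty:", list(zip(v_mean.round(2), v_std.round(2))))
```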
Uncertainty Estimation in Continuous Models applied to Reinforcement Learning
TLDR
This work considers the model-based reinforcement learning setting, where the goal is to learn a model and a control policy for a given objective, and models the environment dynamics with either Gaussian processes or a Bayesian neural network.
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
TLDR
This work proposes a model-based RL framework built on probabilistic Model Predictive Control with Gaussian Processes, incorporating model uncertainty into long-term predictions to reduce the impact of model errors, and provides theoretical guarantees of first-order optimality for GP transition models with deterministic approximate inference for long-term planning.
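A minimal sketch of GP-based model predictive control, assuming a random-shooting planner and a made-up 1-D plant, is given below. The cited framework uses deterministic approximate inference to propagate uncertainty over the horizon; rolling out only the GP predictive mean, as done here, is a simplification.

```python
# Minimal sketch of GP-based model predictive control via random shooting:
# at every step, roll candidate action sequences through the GP model's
# predictive mean and apply the first action of the best sequence.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

def env_step(x, u):
    """Hypothetical 1-D plant used for data collection and execution."""
    return 0.95 * x + 0.3 * u + 0.01 * rng.standard_normal()

# Fit a GP dynamics model from random-interaction data.
X = rng.uniform(-2, 2, size=(80, 2))                  # columns: [x, u]
y = np.array([env_step(x, u) for x, u in X])
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True).fit(X, y)

def mpc_action(x, horizon=8, n_candidates=64):
    """Pick the first action of the candidate sequence with lowest predicted cost."""
    U = rng.uniform(-1, 1, size=(n_candidates, horizon))
    states = np.full(n_candidates, x)
    cost = np.zeros(n_candidates)
    for t in range(horizon):
        states = gp.predict(np.column_stack([states, U[:, t]]))
        cost += states ** 2                           # quadratic cost on the state
    return U[np.argmin(cost), 0]

x = 1.5
for step in range(15):                                # receding-horizon loop
    u = mpc_action(x)
    x = env_step(x, u)
print("final state after MPC:", round(x, 3))
```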
Online reinforcement learning by Bayesian inference
TLDR
This paper proposes an online reinforcement learning algorithm, referred to as Bayesian-SARSA, that sequentially updates from observed variables such as states and rewards via Bayesian inference during policy evaluation.
Model-Based Bayesian Sparse Sampling for Data Efficient Control
TLDR
A novel Bayesian-inspired model-based policy search algorithm for data-efficient control that makes use of approximate Gaussian processes in the form of random Fourier features for fast online system identification and computationally efficient posterior updates via rank-one Cholesky updates.
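The random-Fourier-feature and rank-one Cholesky ingredients named in this TLDR can be illustrated concretely. The sketch below is not the cited algorithm: it only shows an RFF approximation of an RBF-kernel GP as Bayesian linear regression, with the posterior precision's Cholesky factor updated rank-one as observations stream in; the feature count, lengthscale, noise level, and target function are illustrative assumptions.

```python
# Approximate an RBF-kernel GP by Bayesian linear regression on random cosine
# features, and update the posterior online with rank-one Cholesky updates.
import numpy as np
from scipy.linalg import cho_solve

rng = np.random.default_rng(3)
D, lengthscale, noise, alpha = 100, 0.5, 0.1, 1.0

W = rng.standard_normal((D, 1)) / lengthscale      # spectral frequencies
b = rng.uniform(0, 2 * np.pi, D)                   # random phases

def phi(x):
    """Random Fourier features approximating an RBF kernel."""
    return np.sqrt(2.0 / D) * np.cos(W @ np.atleast_1d(x) + b)

def chol_rank1_update(L, v):
    """Return the Cholesky factor of L @ L.T + v v^T."""
    L, v = L.copy(), v.copy()
    for k in range(len(v)):
        r = np.hypot(L[k, k], v[k])
        c, s = r / L[k, k], v[k] / L[k, k]
        L[k, k] = r
        L[k + 1:, k] = (L[k + 1:, k] + s * v[k + 1:]) / c
        v[k + 1:] = c * v[k + 1:] - s * L[k + 1:, k]
    return L

# Prior: weights ~ N(0, alpha^{-1} I); maintain Cholesky of the posterior precision.
L = np.sqrt(alpha) * np.eye(D)
rhs = np.zeros(D)

for _ in range(200):                               # stream of observations
    x = rng.uniform(-3, 3)
    y = np.sin(x) + noise * rng.standard_normal()  # hypothetical target function
    f = phi(x)
    L = chol_rank1_update(L, f / noise)            # precision += phi phi^T / noise^2
    rhs += f * y / noise ** 2

mean_weights = cho_solve((L, True), rhs)
print("posterior predictive mean at x=1:", float(phi(1.0) @ mean_weights))
```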
Multi-Fidelity Model-Free Reinforcement Learning with Gaussian Processes
TLDR
A model-free Multi-Fidelity Reinforcement Learning (MFRL) algorithm that leverages Gaussian Processes (GPs) to learn the optimal policy in the real world, reducing the number of real-world learning samples as the agent moves up a chain of simulators.
Multi-Fidelity Reinforcement Learning with Gaussian Processes
TLDR
Two versions of Multi-Fidelity Reinforcement Learning (MFRL) are presented, model-based and model-free, that leverage Gaussian Processes (GPs) to learn the optimal policy in a real-world environment.
Gaussian Processes for Data-Efficient Learning in Robotics and Control
TLDR
This paper learns a probabilistic, non-parametric Gaussian process transition model of the system and applies it to autonomous learning in real robot and control tasks, achieving an unprecedented speed of learning.
Online Constrained Model-based Reinforcement Learning
TLDR
This work proposes a model-based approach that combines Gaussian Process regression and Receding Horizon Control, resulting in an agent that can learn and plan in real time under non-linear constraints, and demonstrates the benefits of online learning on an autonomous racing task.
...
...

References

Showing 1-10 of 267 references
Gaussian Processes in Reinforcement Learning
TLDR
It is speculated that the intrinsic ability of GP models to characterise distributions of functions would allow the method to capture entire distributions over future values instead of merely their expectation, which has traditionally been the focus of much of reinforcement learning.
A Bayesian Framework for Reinforcement Learning
TLDR
It is proposed that the learning process estimate the full posterior distribution over models online; to determine behavior, a hypothesis is sampled from this distribution and the greedy policy with respect to that hypothesis is obtained by dynamic programming.
Gaussian process dynamic programming
Reinforcement learning with Gaussian processes
TLDR
A SARSA-based extension of GPTD, termed GPSARSA, is presented that allows the selection of actions and the gradual improvement of policies without requiring a world model.
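A heavily simplified sketch of the GPSARSA idea, a GP over Q(state, action) trained on one-step SARSA targets with epsilon-greedy action selection, is shown below. Engel et al.'s algorithm performs online posterior updates with sparsification; the periodic batch refit and the toy chain environment here are simplifying assumptions.

```python
# Simplified, model-free sketch in the spirit of GPSARSA: a GP over Q(state, action)
# is refit to one-step SARSA targets collected from interaction, and actions are
# chosen epsilon-greedily on the GP mean.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)
actions = np.array([-1.0, 1.0])
gamma, eps = 0.95, 0.2

def env_step(x, a):
    """Hypothetical 1-D chain: move left/right; reward equals the new position."""
    x2 = np.clip(x + 0.1 * a + 0.02 * rng.standard_normal(), -1.0, 1.0)
    return x2, x2

gp_q = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
data_X, data_y, fitted = [], [], False

def q_values(x):
    """GP mean of Q(x, a) for every action (zeros before the first fit)."""
    if not fitted:
        return np.zeros(len(actions))
    return gp_q.predict(np.column_stack([np.full(len(actions), x), actions]))

x, a = -0.5, actions[rng.integers(2)]
for t in range(300):
    x2, r = env_step(x, a)
    a2 = actions[rng.integers(2)] if rng.random() < eps else actions[np.argmax(q_values(x2))]
    # One-step SARSA target bootstrapped from the current GP mean.
    target = r + gamma * q_values(x2)[np.argmax(actions == a2)]
    data_X.append([x, a])
    data_y.append(target)
    if (t + 1) % 50 == 0:                      # periodic batch refit of the GP
        gp_q.fit(np.array(data_X), np.array(data_y))
        fitted = True
    x, a = x2, a2

print("greedy action at x=0.9:", actions[np.argmax(q_values(0.9))])
```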
State-Space Inference and Learning with Gaussian Processes
TLDR
A new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models is proposed and the expectation maximization algorithm is applied.
A Bayesian Sampling Approach to Exploration in Reinforcement Learning
TLDR
This work presents a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models and achieves near-optimal reward with high probability with a sample complexity that is low relative to the speed at which the posterior distribution converges during learning.
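The Bayesian model-based exploration strategy described here, and in "A Bayesian Framework for Reinforcement Learning" above, can be sketched as posterior sampling on a tabular MDP: sample a transition model from a Dirichlet posterior, plan in the sample by dynamic programming, act greedily, and update the posterior. The 3-state chain and its reward structure below are illustrative assumptions, not the cited setup.

```python
# Minimal posterior-sampling sketch: Dirichlet posterior over transitions,
# value iteration in a sampled model, greedy action, Bayesian count update.
import numpy as np

rng = np.random.default_rng(5)
n_states, n_actions, gamma = 3, 2, 0.9
R = np.array([0.0, 0.0, 1.0])                      # reward for arriving in each state

def true_step(s, a):
    """Hypothetical chain: action 1 tends to move right, action 0 left."""
    p_forward = 0.8 if a == 1 else 0.2
    return min(s + 1, n_states - 1) if rng.random() < p_forward else max(s - 1, 0)

# Dirichlet posterior over P(s' | s, a), initialised with a uniform prior.
counts = np.ones((n_states, n_actions, n_states))

def value_iteration(P, iters=100):
    """Return Q(s, a) for the model P[s, a, s']."""
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = np.einsum('san,n->sa', P, R + gamma * V)
        V = Q.max(axis=1)
    return Q

s = 0
for _ in range(200):                               # interaction loop
    P_sample = np.array([[rng.dirichlet(counts[s_, a_]) for a_ in range(n_actions)]
                         for s_ in range(n_states)])
    Q = value_iteration(P_sample)                  # plan in the sampled model
    a = int(np.argmax(Q[s]))                       # act greedily w.r.t. the sample
    s2 = true_step(s, a)
    counts[s, a, s2] += 1                          # Bayesian posterior update
    s = s2

print("greedy policy under the posterior mean:",
      value_iteration(counts / counts.sum(axis=2, keepdims=True)).argmax(axis=1))
```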
Model-free off-policy reinforcement learning in continuous environment
  • P. Wawrzynski, A. Pacut
  • Computer Science
    2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541)
  • 2004
TLDR
An algorithm for reinforcement learning in continuous state and action spaces that utilizes the entire history of agent-environment interaction to construct a control policy, requiring an interaction history several times shorter than that required by other algorithms.
An analytic solution to discrete Bayesian reinforcement learning
TLDR
This work proposes a new algorithm, called BEETLE, for effective online learning that is computationally efficient while minimizing the amount of exploration, and takes a Bayesian model-based approach, framing RL as a partially observable Markov decision process.
Learning nonlinear state-space models for control
  • T. Raiko, M. Tornio
  • Computer Science
    Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.
  • 2005
TLDR
Simulations with a cart-pole swing-up task confirm that the latent state space provides a representation that is easier to predict and control than the original observation space.
...
...