Corpus ID: 226237237

Sample-efficient reinforcement learning using deep Gaussian processes

Authors: Charles W. L. Gadd, Markus Heinonen, Harri Lähdesmäki, Samuel Kaski
Reinforcement learning provides a framework for learning, through trial and error, which actions to take to complete a task. In many applications observing interactions is costly, necessitating sample-efficient learning. Model-based reinforcement learning improves efficiency by learning to simulate the world dynamics; the challenge is that model inaccuracies rapidly accumulate over planned trajectories. We introduce deep Gaussian processes where the depth of the…
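The abstract above rests on fitting a probabilistic dynamics model from observed transitions. Below is a minimal sketch of that ingredient, not the paper's deep-GP method: plain GP regression with an RBF kernel learning a one-step dynamics map from hypothetical (state, action) pairs, implemented with NumPy only.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between row-vector inputs A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
# Toy 1-D dynamics: next state is a nonlinear function of (state, action).
X = rng.uniform(-2, 2, size=(30, 2))                                   # (state, action) pairs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(30)   # observed next states

# Standard GP regression equations via a Cholesky factorisation.
noise = 0.05**2
K = rbf_kernel(X, X) + noise * np.eye(len(X))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

def predict(x_star):
    """Posterior mean and variance of the next state at test inputs."""
    Ks = rbf_kernel(x_star, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(x_star, x_star).diagonal() - (v**2).sum(0)
    return mean, var

mu, var = predict(np.array([[0.5, 0.0]]))
```

The posterior variance is what model-based planners exploit: it grows away from the training data, flagging regions where simulated trajectories should not be trusted.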
Robust Learning of Physics Informed Neural Networks
A Gaussian process (GP) based smoothing scheme is introduced that recovers the performance of a PINN and promises an architecture robust to noise and errors in measurements; an inexpensive method for quantifying the evolution of uncertainty, based on the variance estimates of GPs on boundary data, is also illustrated.
Learning Optimal Control with Stochastic Models of Hamiltonian Dynamics
A reduced Hamiltonian of the unconstrained Hamiltonian is learned by going backward in time and minimizing the loss function that results from applying the conditions of Pontryagin's maximum principle.
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
This paper proposes a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation, which matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples.
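The PETS summary above combines two ideas: an ensemble of probabilistic dynamics models, and uncertainty propagation by sampling whole trajectories through randomly chosen ensemble members. A toy illustration of that propagation scheme, with simple linear stand-ins for the paper's learned neural networks:

```python
import numpy as np

rng = np.random.default_rng(1)

# Each ensemble member predicts a Gaussian over the next state; the slightly
# different coefficients mimic disagreement between independently trained models.
ensemble = [{"a": a, "sigma": 0.05} for a in rng.normal(0.9, 0.02, size=5)]

def step(member, state, action):
    mean = member["a"] * state + 0.1 * action
    return mean + member["sigma"] * rng.standard_normal()

def rollout(action_seq, n_particles=20, s0=1.0):
    """Propagate particles; each particle commits to one ensemble member."""
    returns = []
    for _ in range(n_particles):
        member = ensemble[rng.integers(len(ensemble))]
        s, total = s0, 0.0
        for a in action_seq:
            s = step(member, s, a)
            total += -s**2          # toy reward: drive the state to zero
        returns.append(total)
    return np.mean(returns)

score = rollout(np.zeros(10))
```

Averaging returns over particles makes the estimate reflect both the per-model noise and the disagreement across ensemble members, which is the uncertainty-aware behavior the paper credits for matching model-free performance.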
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
It is demonstrated that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks.
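The MPC loop described in this summary can be sketched in a few lines: sample random action sequences, roll each through the learned dynamics model, and execute only the first action of the best sequence before re-planning. The dynamics function here is a hand-coded stand-in for a learned network; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def model(state, action):
    """Stand-in for a learned neural network dynamics model."""
    return 0.95 * state + 0.2 * action

def mpc_action(state, horizon=8, n_candidates=200):
    """Random-shooting MPC: best first action among sampled sequences."""
    best_action, best_cost = 0.0, np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=horizon)
        s, cost = state, 0.0
        for a in actions:
            s = model(s, a)
            cost += s**2            # quadratic cost: regulate the state to zero
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action

# Closed-loop control from an initial state of 2.0
s = 2.0
for _ in range(15):
    s = model(s, mpc_action(s))
```

Re-planning at every step is what lets a mediocre model still produce stable behavior: errors in long rollouts only affect the ranking of candidates, never the executed trajectory beyond one step.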
Improving PILCO with Bayesian Neural Network Dynamics Models
PILCO's framework is extended to use Bayesian deep dynamics models with approximate variational inference, allowing PILCO to scale linearly with the number of trials and the observation space dimensionality, and it is shown that moment matching is a crucial simplifying assumption made by the model.
Towards Sample Efficient Reinforcement Learning
An understanding of the problem is shared, and possible ways to alleviate the sample cost of reinforcement learning are discussed from the perspectives of exploration, optimization, environment modeling, experience transfer, and abstraction.
Model-Based Reinforcement Learning for Atari
Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, is described and a comparison of several model architectures is presented, including a novel architecture that yields the best results in the authors' setting.
Gaussian Processes in Reinforcement Learning
It is speculated that the intrinsic ability of GP models to characterise distributions of functions would allow the method to capture entire distributions over future values instead of merely their expectation, which has traditionally been the focus of much of reinforcement learning.
Nonlinear Inverse Reinforcement Learning with Gaussian Processes
A probabilistic algorithm is presented that captures complex behaviors from suboptimal stochastic demonstrations while automatically balancing the simplicity of the learned reward structure against its consistency with the observed actions.
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.
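"Explicitly incorporating model uncertainty into long-term planning" in PILCO means propagating a whole state distribution, not a point, through the probabilistic model. PILCO does this analytically via moment matching; the sketch below illustrates the same idea by Monte Carlo, using a hand-coded uncertain dynamics function as a stand-in for a GP posterior. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def uncertain_dynamics(s):
    """Nonlinear mean plus model uncertainty (stand-in for a GP posterior)."""
    return np.sin(s) + 0.1 * rng.standard_normal(s.shape)

# Gaussian state distribution at time t
mu_t, var_t = 0.5, 0.2

# Sample the state, propagate through the uncertain model, then approximate
# the resulting non-Gaussian distribution by its matched mean and variance.
samples = mu_t + np.sqrt(var_t) * rng.standard_normal(100_000)
next_samples = uncertain_dynamics(samples)
mu_next, var_next = next_samples.mean(), next_samples.var()
```

Note that mu_next is not simply sin(mu_t): pushing a distribution through a nonlinearity shifts its mean, which is exactly the effect point-estimate models ignore and moment matching captures.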
Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo
This work provides evidence for the non-Gaussian nature of the posterior and applies the Stochastic Gradient Hamiltonian Monte Carlo method to generate samples, which results in significantly better predictions at a lower computational cost than its VI counterpart.
Deeper Connections between Neural Networks and Gaussian Processes Speed-up Active Learning
This work proposes to approximate Bayesian neural networks (BNNs) by Gaussian processes, which allows the uncertainty estimates of predictions to be updated efficiently without retraining the neural network, while avoiding overconfident uncertainty predictions for out-of-sample points.