Corpus ID: 636855

Safe Model-based Reinforcement Learning with Stability Guarantees

@inproceedings{Berkenkamp2017SafeMR,
  title={Safe Model-based Reinforcement Learning with Stability Guarantees},
  author={Felix Berkenkamp and Matteo Turchetta and Angela P. Schoellig and Andreas Krause},
  booktitle={NIPS},
  year={2017}
}
Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. [...] Specifically, we extend control-theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the…
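
To make the key method concrete, here is a minimal sketch of the certification idea; this is not the authors' code, and the quadratic Lyapunov candidate v(x) = x^2, the 1-D linear dynamics, and the confidence constant beta are illustrative assumptions (the paper derives a principled beta and works over a Lipschitz-based discretization). A grid point is certified only if the Lyapunov function decreases for every dynamics model inside the GP confidence interval.

```python
# Sketch: Lyapunov decrease check against GP confidence bounds (assumed setup).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 1))                 # observed states
y = 0.9 * X[:, 0] + 0.01 * rng.standard_normal(30)       # noisy next states
gp = GaussianProcessRegressor(alpha=1e-4).fit(X, y)      # learned dynamics model

beta = 2.0                                               # confidence scaling (assumed)
grid = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
mu, sigma = gp.predict(grid, return_std=True)

# With v(x) = x^2, require v(x_next) < v(x) for the *worst-case* next state
# inside the confidence interval [mu - beta*sigma, mu + beta*sigma].
worst_next = np.maximum(np.abs(mu - beta * sigma), np.abs(mu + beta * sigma))
certified = worst_next ** 2 < grid[:, 0] ** 2
print(f"{certified.sum()} of {len(grid)} grid points satisfy the decrease condition")
```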
The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamic Systems
TLDR
A method to learn accurate safety certificates for nonlinear, closed-loop dynamical systems by constructing a neural network Lyapunov function together with a training algorithm that adapts it to the shape of the largest safe region in the state space.
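
As a rough illustration of one such construction (an assumed architecture, not the paper's exact one, which also adapts the certified region during training), a network of the form v(x) = ||phi(x) - phi(0)||^2 is non-negative everywhere and vanishes at the origin for any feature network phi:

```python
# Sketch: a neural network Lyapunov candidate, positive by construction.
import torch
import torch.nn as nn

class LyapunovNet(nn.Module):
    def __init__(self, state_dim, hidden=64, features=32):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, features),
        )

    def forward(self, x):
        # Subtracting phi(0) pins v(0) = 0; the squared norm enforces v >= 0.
        zero = torch.zeros(1, x.shape[-1], device=x.device)
        return ((self.phi(x) - self.phi(zero)) ** 2).sum(dim=-1)

v = LyapunovNet(state_dim=2)
print(v(torch.randn(5, 2)))      # five non-negative values
print(v(torch.zeros(1, 2)))      # exactly zero at the origin
```

Training would then penalize violations of the decrease condition along observed closed-loop transitions, which is the adaptation step the TLDR refers to.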
Safe Reinforcement Learning with Stability & Safety Guarantees Using Robust MPC
TLDR
A formal theory detailing how safety and stability can be enforced through the parameter updates delivered by reinforcement learning tools has so far been lacking; this work develops such a theory for the generic robust MPC case.
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee
TLDR
The classic Lyapunov method is used to analyze uniform ultimate boundedness (UUB) stability based solely on data, without a mathematical model, and the learned controller shows impressive resilience even in the presence of external disturbances.
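
For reference, uniform ultimate boundedness is the standard notion (following, e.g., Khalil's textbook definition): every trajectory starting in a given ball enters and remains in a bounded ball after a finite time that is uniform over initial conditions,

```latex
\|x(t_0)\| \le a \;\Longrightarrow\; \|x(t)\| \le b \quad \text{for all } t \ge t_0 + T(a, b),
```

where T(a, b) is finite and does not depend on t_0. This is weaker than asymptotic stability: trajectories need not converge to the equilibrium, only to its neighborhood.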
Learning-Based Model Predictive Control for Safe Exploration
TLDR
This paper presents a learning-based model predictive control scheme that can provide provable high-probability safety guarantees and exploits regularity assumptions on the dynamics in terms of a Gaussian process prior to construct provably accurate confidence intervals on predicted trajectories.
Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles
TLDR
This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints, and constructs a novel barrier-based control policy structure that can guarantee control safety.
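
To illustrate the barrier-based idea in its simplest form (this is a generic control barrier function filter, not the paper's algorithm; the 1-D integrator dynamics, h(x) = 1 - x^2, and alpha are hypothetical choices), an RL action can be projected onto the set of actions that keep h(x) >= 0:

```python
# Sketch: a CBF quadratic-program safety filter for x_dot = u.
import cvxpy as cp

def safe_action(x, u_rl, alpha=1.0):
    h = 1.0 - x ** 2          # safe set {x : h(x) >= 0}, i.e. |x| <= 1
    dh_dx = -2.0 * x
    u = cp.Variable()
    # CBF condition: dh/dx * x_dot + alpha * h >= 0 keeps h from crossing zero.
    prob = cp.Problem(cp.Minimize(cp.square(u - u_rl)),
                      [dh_dx * u + alpha * h >= 0])
    prob.solve()
    return float(u.value)

print(safe_action(x=0.9, u_rl=2.0))  # the aggressive RL action gets clipped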
Learning-based Model Predictive Control for Safe Reinforcement Learning
TLDR
This paper combines a provably safe learning-based MPC scheme that allows for input-dependent uncertainties with techniques from model-based RL to solve tasks with only limited prior knowledge, and evaluates the resulting algorithm on a reinforcement learning task in a simulated cart-pole system with safety constraints.
Barrier-Certified Adaptive Reinforcement Learning With Applications to Brushbot Navigation
TLDR
A safe learning framework that employs an adaptive model learning algorithm together with barrier certificates for systems with possibly nonstationary agent dynamics; solutions to the barrier-certified policy optimization are guaranteed to be globally optimal, ensuring greedy policy improvement under mild conditions.
A Lyapunov-based Approach to Safe Reinforcement Learning
TLDR
This work defines and presents a method for constructing Lyapunov functions, which provide an effective way to guarantee the global safety of a behavior policy during training via a set of local, linear constraints.
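
The constraints are linear because the expected one-step Lyapunov value is linear in the policy's action probabilities; schematically (notation assumed here, with Q_L the one-step Lyapunov value of taking action a at state x and epsilon(x) an auxiliary budget):

```latex
\sum_{a} \pi(a \mid x)\, Q_L(x, a) \;\le\; L(x) + \epsilon(x) \qquad \text{for each visited state } x .
```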
Online Policies for Real-Time Control Using MRAC-RL
TLDR
This paper proposes a set of novel MRAC algorithms, applies them to a class of nonlinear systems, derives the associated control laws, provides stability guarantees for the resulting closed-loop system, and shows that the adaptive tracking objective is achieved.
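
For readers unfamiliar with MRAC, here is a minimal sketch of the model-reference adaptation idea on a scalar plant (a textbook MRAC law, not the paper's MRAC-RL algorithm; all constants are illustrative): gains adapt online so the plant state x tracks a stable reference model x_m.

```python
# Sketch: scalar model-reference adaptive control (textbook formulation).
import numpy as np

a, b = 1.0, 2.0            # plant x_dot = a*x + b*u (unknown to the controller)
a_m, b_m = -4.0, 4.0       # reference model x_m_dot = a_m*x_m + b_m*r
gamma, dt = 5.0, 1e-3      # adaptation gain, Euler integration step

x = x_m = 0.0
k_x = k_r = 0.0            # adaptive feedback / feedforward gains
for _ in range(20000):
    r = 1.0                            # constant reference command
    e = x - x_m                        # tracking error
    u = k_x * x + k_r * r              # control law with adaptive gains
    # Lyapunov-based adaptation laws (only sign(b) assumed known).
    k_x -= gamma * e * x * np.sign(b) * dt
    k_r -= gamma * e * r * np.sign(b) * dt
    x += (a * x + b * u) * dt          # Euler step of the plant
    x_m += (a_m * x_m + b_m * r) * dt  # Euler step of the reference model

print(f"tracking error after adaptation: {x - x_m:.4f}")
```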
Safe Policy Learning for Continuous Control
TLDR
Safe policy optimization algorithms based on a Lyapunov approach to solve continuous-action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through near-safe policies, i.e., policies that keep the agent in desirable situations both during training and at convergence.

References

Showing 1-10 of 44 references
Lyapunov Design for Safe Reinforcement Learning
TLDR
This work proposes a method for constructing safe, reliable reinforcement learning agents based on Lyapunov design principles that ensures qualitatively satisfactory agent behavior for virtually any reinforcement learning algorithm and at all times, including while the agent is learning and taking exploratory actions.
Reachability-based safe learning with Gaussian processes
TLDR
This work proposes a novel method that learns the system's unknown dynamics with a Gaussian process model and iteratively approximates the maximal safe set; it further incorporates safety into the reinforcement learning performance metric, allowing a better integration of safety and learning.
Stability of Controllers for Gaussian Process Forward Models
TLDR
This work provides a stability analysis tool for controllers acting on dynamics represented by Gaussian processes, and considers arbitrary Markovian control policies and system dynamics given as (i) the mean of a GP, and (ii) the full GP distribution.
Safe Exploration in Markov Decision Processes
TLDR
This paper proposes a general formulation of safety through ergodicity, and shows that imposing safety by restricting attention to the resulting set of guaranteed safe policies is NP-hard, and presents an efficient algorithm for guaranteed safe, but potentially suboptimal, exploration.
Constrained Policy Optimization
TLDR
Constrained Policy Optimization (CPO) is proposed, the first general-purpose policy search algorithm for constrained reinforcement learning with guarantees of near-constraint satisfaction at each iteration; it allows training neural network policies for high-dimensional control while making guarantees about policy behavior throughout training.
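
The underlying problem CPO addresses is the standard constrained MDP formulation (the trust-region update that yields the per-iteration guarantee is omitted here):

```latex
\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\Big[\textstyle\sum_{t} \gamma^{t} r(s_t, a_t)\Big]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi}\!\Big[\textstyle\sum_{t} \gamma^{t} c_i(s_t, a_t)\Big] \le d_i
\quad \text{for each constraint } i .
```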
Provably safe and robust learning-based model predictive control
TLDR
A learning-based model predictive control scheme that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance.
Safe Exploration Techniques for Reinforcement Learning - An Overview
TLDR
This work surveys different approaches to safety in (semi)autonomous robotics and addresses the issue of how to define safety for real-world applications (absolute safety being unachievable in the continuous and stochastic real world).
Risk-Sensitive Reinforcement Learning Applied to Control under Constraints
TLDR
A model-free, heuristic reinforcement learning algorithm that aims at finding good deterministic policies based on weighting the original value function and the risk; it was successfully applied to the control of a feed tank with stochastic inflows that lies upstream of a distillation column.
Safe Exploration of State and Action Spaces in Reinforcement Learning
TLDR
The PI-SRL algorithm is introduced, which safely improves suboptimal albeit robust behaviors for continuous state and action control tasks and efficiently learns from experience gained from the environment.
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
TLDR
A novel algorithm is developed and proven to completely explore the safely reachable part of the MDP without violating the safety constraint; it is demonstrated on digital terrain models for the task of exploring an unknown map with a rover.
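
The core classification rule is simple to sketch (illustrative only, not the SafeMDP algorithm itself, which also reasons about reachability and returnability; the terrain function, threshold h, and beta are hypothetical): a state is deemed safe when the GP lower confidence bound on its safety value clears a threshold.

```python
# Sketch: pessimistic safe-set classification from a GP over safety values.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
states = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
observed = rng.choice(200, size=15, replace=False)
safety_values = np.sin(states[observed, 0]) + 1.2   # hypothetical terrain safety

gp = GaussianProcessRegressor(alpha=1e-3).fit(states[observed], safety_values)
mu, sigma = gp.predict(states, return_std=True)

h, beta = 0.5, 2.0                       # safety threshold, confidence scaling
safe = mu - beta * sigma >= h            # safe only if the *lower* bound clears h
print(f"{safe.sum()} of {len(states)} states certified safe")
```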