# Safe Adaptive Learning-based Control for Constrained Linear Quadratic Regulators with Regret Guarantees

@article{Li2021SafeAL, title={Safe Adaptive Learning-based Control for Constrained Linear Quadratic Regulators with Regret Guarantees}, author={Yingying Li and Subhro Das and Jeff S. Shamma and N. Li}, journal={ArXiv}, year={2021}, volume={abs/2111.00411} }

We study the adaptive control of an unknown linear system with a quadratic cost function subject to safety constraints on both the states and actions. The challenges of this problem arise from the tension among safety, exploration, performance, and computation. To address these challenges, we propose a polynomial-time algorithm that guarantees feasibility and constraint satisfaction with high probability under proper conditions. Our algorithm is implemented on a single trajectory and does not…

## 5 Citations

### Safe Control with Minimal Regret

- Computer Science
- L4DC
- 2022

This paper presents an efficient optimization-based approach for computing a finite-horizon robustly safe control policy that minimizes dynamic regret, in the sense of the loss relative to the optimal sequence of control actions selected in hindsight by a clairvoyant controller.

### Learning-Based Adaptive Control for Stochastic Linear Systems with Input Constraints

- Mathematics, Computer Science
- ArXiv
- 2022

We propose a certainty-equivalence scheme for adaptive control of scalar linear systems subject to additive, i.i.d. Gaussian disturbances and bounded control input constraints, without requiring…

### Safe Perception-Based Control with Minimal Worst-Case Dynamic Regret

- Computer Science
- 2022

A control algorithm is introduced that minimizes dynamic regret, i.e., the suboptimality against an optimal clairvoyant controller that knows the unpredictable future a priori, and that enables safe control of linear time-varying systems in the presence of unknown and unpredictable process and measurement noise.

### A System Level Approach to Regret Optimal Control

- Computer Science, Mathematics
- IEEE Control Systems Letters
- 2022

An optimisation-based method for synthesising a dynamic regret optimal controller for linear systems with potentially adversarial disturbances and known or adversarial initial conditions is presented, and the proposed framework allows guaranteeing state and input constraint satisfaction.

### On the Optimal Control of Network LQR with Spatially-Exponential Decaying Structure

- Mathematics
- 2022

This paper studies network LQR problems with system matrices being spatially-exponential decaying (SED) between nodes in the network. The major objective is to study whether the optimal controller…

## References

### Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

- Computer Science, Mathematics
- NeurIPS
- 2018

This work presents the first provably polynomial time algorithm that provides high probability guarantees of sub-linear regret on this problem of adaptive control of the Linear Quadratic Regulator, where an unknown linear system is controlled subject to quadratic costs.

### Online Optimal Control with Affine Constraints

- Computer Science
- AAAI
- 2021

Theoretically, it is shown that OGD-BZ with proper parameters can guarantee that the system satisfies all the constraints despite any admissible disturbances, and the policy regret is investigated, which is shown to be the square root of the horizon length multiplied by some logarithmic terms of the horizon length under proper algorithm parameters.

### Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs

- Computer Science, Mathematics
- NeurIPS
- 2020

This work considers the problem of robust and adaptive model predictive control of a linear system, with unknown parameters that are learned along the way, in a critical setting where failures must be prevented, and provides the first end-to-end suboptimality analysis for this setting.

### Logarithmic Regret for Online Control

- Computer Science
- NeurIPS
- 2019

It is shown that the optimal regret in this fundamental setting can be significantly smaller, scaling as polylog(T), achieved by two different efficient iterative methods, online gradient descent and online natural gradient.

### Regret Guarantees for Online Receding Horizon Control

- Mathematics
- ArXiv
- 2020

It is shown that the learning-based receding horizon control policy achieves regret of $O(T^{3/4})$ for both the controller's cost and cumulative constraint violation w.r.t. the baseline receding horizon control policy that has full knowledge of the system.

### A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems

- Computer Science
- IEEE Transactions on Automatic Control
- 2019

A general safety framework based on Hamilton–Jacobi reachability methods that can work in conjunction with an arbitrary learning algorithm is proposed, which proves theoretical safety guarantees combining probabilistic and worst-case analysis and demonstrates the proposed framework experimentally on a quadrotor vehicle.

### Performance and safety of Bayesian model predictive control: Scalable model-based RL with guarantees

- Computer Science
- ArXiv
- 2020

This work proposes a cautious model-based reinforcement learning algorithm that can be efficiently implemented in the form of a standard MPC controller and bounds the expected number of unsafe learning episodes using an exact penalty soft-constrained MPC formulation.

### End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

- Computer Science
- AAAI
- 2019

This work proposes a controller architecture that combines a model-free RL-based controller with model-based controllers utilizing control barrier functions (CBFs) and on-line learning of the unknown system dynamics, in order to ensure safety during learning.

### Adaptive MPC under Time Varying Uncertainty: Robust and Stochastic

- Mathematics
- ArXiv
- 2019

This paper deals with the problem of formulating an adaptive Model Predictive Control strategy for constrained uncertain systems that robustly satisfies the imposed constraints for all possible values of the offset uncertainty in the Feasible Parameter Set.

### Geometric Exploration for Online Control

- Mathematics, Computer Science
- NeurIPS
- 2020

The main component of the algorithm is a novel geometric exploration strategy: it adaptively constructs a sequence of barycentric spanners in the policy space, which improves upon the previous best known bound of $T^{2/3}$.