# Regret-Optimal Full-Information Control

```bibtex
@article{Sabag2021RegretOptimalFC,
  title   = {Regret-Optimal Full-Information Control},
  author  = {Oron Sabag and Gautam Goel and Sahin Lale and Babak Hassibi},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2105.01244}
}
```

We consider the infinite-horizon, discrete-time, full-information control problem. Motivated by learning theory, we focus on regret as the criterion for controller design, defined as the difference between the LQR cost of a causal controller (which has access only to past and current disturbances) and the LQR cost of a clairvoyant one (which also has access to future disturbances). In the full-information setting, there is a unique optimal non-causal controller that in terms of LQR cost dominates all…
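The regret criterion above can be illustrated numerically. The sketch below is an illustration, not the paper's construction: the scalar system, weights, and horizon are arbitrary choices. It compares a causal steady-state LQR feedback controller against the clairvoyant controller that optimizes over the entire disturbance sequence; the cost gap between the two is the regret.

```python
import numpy as np

rng = np.random.default_rng(0)
a, q, r, T = 0.9, 1.0, 1.0, 50       # scalar system x_{t+1} = a x_t + u_t + w_t
w = rng.standard_normal(T)           # disturbance sequence

# Stack the dynamics: with x_0 = 0, the state sequence is linear in u and w,
# x = F (u + w), where F[i, j] = a**(i - j) for j <= i (lower triangular).
F = np.tril(a ** (np.subtract.outer(np.arange(T), np.arange(T))))

def cost(u):
    x = F @ (u + w)
    return q * x @ x + r * u @ u

# Clairvoyant (non-causal) controller: minimizes the quadratic cost given
# the whole disturbance sequence -- a linear least-squares problem.
u_nc = np.linalg.solve(q * F.T @ F + r * np.eye(T), -q * F.T @ F @ w)

# Causal controller: steady-state LQR feedback u_t = -K x_t, with K from
# the scalar discrete algebraic Riccati equation (fixed-point iteration).
p = q
for _ in range(500):
    p = q + a * a * p - (a * p) ** 2 / (r + p)
K = a * p / (r + p)

x, u_c = 0.0, np.zeros(T)
for t in range(T):
    u_c[t] = -K * x
    x = a * x + u_c[t] + w[t]

regret = cost(u_c) - cost(u_nc)
print(f"causal cost {cost(u_c):.2f}, clairvoyant cost {cost(u_nc):.2f}, regret {regret:.2f}")
```

Since the clairvoyant input is the global minimizer of the same cost, the regret is nonnegative for every disturbance realization; the paper's controller is designed to minimize the worst case of this quantity.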

## 11 Citations

### Regret-optimal Estimation and Control

- Computer Science, Mathematics · ArXiv
- 2021

This work shows that the regret-optimal estimators and controllers can be derived in state-space form using operator-theoretic techniques from robust control and presents tight, data-dependent bounds on the regret incurred by the algorithms in terms of the energy of the disturbances.

### Regret-Optimal Filtering

- Computer Science · AISTATS
- 2021

The regret-optimal estimator is the causal estimator that minimizes the worst-case regret across all bounded-energy noise sequences and is represented as a finite-dimensional state-space whose parameters can be computed by solving three Riccati equations and a single Lyapunov equation.
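As a rough illustration of the computational primitives mentioned above, the sketch below solves a discrete algebraic Riccati equation by value iteration and a discrete Lyapunov equation by Kronecker vectorization. The toy system matrices are made up for illustration and are unrelated to the three Riccati and one Lyapunov equations of the cited paper.

```python
import numpy as np

# Toy system matrices (illustrative values only).
A = np.array([[0.9, 1.0], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # state weight
R = np.array([[1.0]])  # input weight

# Riccati value iteration: repeat the finite-horizon backup until it
# converges to the stabilizing solution of the DARE
#   P = A'PA - A'PB (R + B'PB)^{-1} B'PA + Q.
P = Q.copy()
for _ in range(1000):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = A.T @ P @ (A - B @ G) + Q

K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # LQR gain

# Discrete Lyapunov equation for the closed loop Acl = A - BK:
#   X = Acl X Acl' + Q, solved via Kronecker vectorization
#   (I - Acl (x) Acl) vec(X) = vec(Q).
Acl = A - B @ K
n = A.shape[0]
X = np.linalg.solve(np.eye(n * n) - np.kron(Acl, Acl), Q.flatten()).reshape(n, n)
```

Both equations have standard library solvers (e.g. SciPy's `solve_discrete_are` and `solve_discrete_lyapunov`); the explicit iterations here just make the fixed-point structure visible.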

### Safe Control with Minimal Regret

- Computer Science · L4DC
- 2022

This paper presents an optimization-based approach for computing a robustly safe control policy that minimizes dynamic regret, in the sense of the loss relative to the optimal sequence of control actions selected in hindsight by a clairvoyant controller.

### Implications of Regret on Stability of Linear Dynamical Systems

- Computer Science · ArXiv
- 2022

It is shown that for linear state feedback policies and linear systems subject to adversarial disturbances, linear regret implies asymptotic stability in both time-varying and time-invariant settings, and that bounded input bounded state (BIBS) stability and summability of the state transition matrices imply linear regret.

### Optimal Competitive-Ratio Control

- Computer Science · ArXiv
- 2022

By formulating a regret-optimal control framework with weight functions that can also be utilized for practical purposes, an interesting relation is revealed between the explicit solutions that now exist for both competitive control paradigms.

### A System Level Approach to Regret Optimal Control

- Computer Science, Mathematics · IEEE Control Systems Letters
- 2022

An optimisation-based method is presented for synthesising a dynamic-regret-optimal controller for linear systems with potentially adversarial disturbances and known or adversarial initial conditions; the proposed framework also allows guaranteeing state and input constraint satisfaction.

### Online estimation and control with optimal pathlength regret

- Computer Science · L4DC
- 2022

The key idea in the derivation is to reduce pathlength-optimal filtering and control in linear dynamical systems to certain variational problems in robust estimation and control; these reductions may be of independent interest.

### Thompson Sampling Achieves $\tilde{O}(\sqrt{T})$ Regret in Linear Quadratic Control

- Computer Science · COLT
- 2022

It is shown that TS achieves order-optimal regret in adaptive control of multidimensional stabilizable LQRs by carefully prescribing an early exploration strategy and a policy update rule, thereby solving the open problem posed in Abeille and Lazaric (2018).

### Regret-Optimal Filtering for Prediction and Estimation

- Computer Science · IEEE Transactions on Signal Processing
- 2022

Numerical simulations demonstrate that regret minimization inherently interpolates between the performance of the designed filter and that of a clairvoyant filter, and is thus a viable approach for filter design.

### Supplementary Materials for “ Regret-Optimal Filtering ”

- Mathematics
- 2021

The main objective of the Supplementary Materials file is to provide detailed derivation and proofs for the main results in the paper. It has three main sections. In the first section, we present the…

## References

Showing 1-10 of 22 references.

### Regret-Optimal Controller for the Full-Information Problem

- Mathematics, Computer Science · 2021 American Control Conference (ACC)
- 2021

The regret-optimal control problem can be reduced to a Nehari extension problem, i.e., to approximating an anticausal operator by a causal one in the operator norm. Simulations over a range of plants demonstrate that the regret-optimal controller interpolates nicely between the H2 and the H∞ optimal controllers, and generally has H2 and H∞ costs that are simultaneously close to their optimal values.

### Regret-optimal measurement-feedback control

- Mathematics · L4DC
- 2021

It is shown that in the measurement-feedback setting, unlike in the full-information setting, there is no single offline controller which outperforms every other offline controller on every disturbance, and a new H2-optimal offline controller is proposed as a benchmark for the online controller to compete against.

### Regret-optimal control in dynamic environments

- Mathematics, Computer Science · ArXiv
- 2020

The structure of the regret-optimal online controller is derived via a novel reduction to H∞ control, and a clean data-dependent bound on its regret is presented; numerical simulations confirm that the controller significantly outperforms the H2 and H∞ controllers in dynamic environments.

### Regret-Optimal Filtering

- Computer Science · AISTATS
- 2021

The regret-optimal estimator is the causal estimator that minimizes the worst-case regret across all bounded-energy noise sequences and is represented as a finite-dimensional state-space whose parameters can be computed by solving three Riccati equations and a single Lyapunov equation.

### The Power of Linear Controllers in LQR Control

- Computer Science · ArXiv
- 2020

Within the Linear Quadratic Regulator framework of regulating a linear dynamical system perturbed by environmental noise, this work fully characterizes the optimal offline policy and shows that it has a recursive form in terms of the optimal online policy and future disturbances.
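The recursive form described above can be checked numerically in a scalar finite-horizon example: the offline-optimal input equals the online LQR feedback term plus a correction driven by the known future disturbances. The sketch below is a standard dynamic-programming derivation under arbitrary toy parameters, not the cited paper's exact notation.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, q, r, T = 0.9, 1.0, 1.0, 1.0, 30   # scalar x_{t+1} = a x_t + b u_t + w_t
w = rng.standard_normal(T)                # known disturbance sequence
# Cost: sum_{t=0}^{T-1} (q x_t^2 + r u_t^2) + q x_T^2, with x_0 = 0.

# Backward pass: finite-horizon Riccati recursion for P_t, plus a linear
# value-function term s_t that accumulates the known future disturbances:
#   s_t = (a - b K_t) (P_{t+1} w_t + s_{t+1}),  s_T = 0.
P = np.zeros(T + 1); s = np.zeros(T + 1); K = np.zeros(T)
P[T] = q
for t in range(T - 1, -1, -1):
    K[t] = a * b * P[t + 1] / (r + b * b * P[t + 1])
    P[t] = q + a * P[t + 1] * (a - b * K[t])
    s[t] = (a - b * K[t]) * (P[t + 1] * w[t] + s[t + 1])

# Forward pass: the offline-optimal input is the online LQR feedback
# -K_t x_t plus a term summing discounted future disturbances via s_{t+1}.
x, u = 0.0, np.zeros(T)
for t in range(T):
    u[t] = -K[t] * x - b * (P[t + 1] * w[t] + s[t + 1]) / (r + b * b * P[t + 1])
    x = a * x + b * u[t] + w[t]
```

A sanity check is to compare `u` against the minimizer of the same quadratic cost obtained by least squares over the full input sequence; the two coincide to numerical precision.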

### Regret Bounds for the Adaptive Control of Linear Quadratic Systems

- Computer Science, Mathematics · COLT
- 2011

The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm; this is the first time a regret bound has been derived for the LQ control problem.

### Explore More and Improve Regret in Linear Quadratic Regulators

- Computer Science, Mathematics · ArXiv
- 2020

A framework for adaptive control is proposed that exploits the characteristics of linear dynamical systems and deploys additional exploration in the early stages of agent-environment interaction to guarantee earlier design of stabilizing controllers.

### Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis

- Computer Science · NeurIPS
- 2019

This paper designs online algorithms, Receding Horizon Gradient-based Control (RHGC), that utilize predictions through finite steps of gradient computations, and provides a fundamental limit on the dynamic regret of any online algorithm by considering linear quadratic tracking problems.

### Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

- Computer Science, Mathematics · NeurIPS
- 2018

This work presents the first provably polynomial-time algorithm with high-probability guarantees of sub-linear regret for adaptive control of the Linear Quadratic Regulator, where an unknown linear system is controlled subject to quadratic costs.

### Logarithmic Regret for Online Control

- Computer Science · NeurIPS
- 2019

It is shown that the optimal regret in this fundamental setting can be significantly smaller, scaling as polylog(T), and is achieved by two different efficient iterative methods: online gradient descent and online natural gradient.