# Adaptive Regret for Control of Time-Varying Dynamics

@article{Gradu2020AdaptiveRF, title={Adaptive Regret for Control of Time-Varying Dynamics}, author={Paula Gradu and Elad Hazan and Edgar Minasyan}, journal={ArXiv}, year={2020}, volume={abs/2007.04393} }

We consider regret minimization for online control with time-varying linear dynamical systems. The metric of performance we study is adaptive policy regret, or regret compared to the best policy on *any interval in time*. We give an efficient algorithm that attains first-order adaptive regret guarantees for the setting of online convex optimization with memory. We also show that these first-order bounds are nearly tight. This algorithm is then used to derive a controller with adaptive…
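
The adaptive policy regret described above can be written out explicitly. The following is a sketch in notation assumed here for illustration, not taken from this listing: $c_t$ is the cost at time $t$, $(x_t, u_t)$ are the state and control produced by the controller, $\Pi$ is the comparator policy class, and $(x_t^{\pi}, u_t^{\pi})$ is the trajectory that a fixed policy $\pi$ would have generated.

```latex
% Adaptive policy regret: worst case, over all contiguous intervals [r, s],
% of the regret against the best fixed policy on that interval.
% All symbols (c_t, x_t, u_t, \Pi) are illustrative assumptions.
\[
\mathrm{AdaptiveRegret}_T
  \;=\; \max_{1 \le r \le s \le T}
  \left[ \sum_{t=r}^{s} c_t(x_t, u_t)
  \;-\; \min_{\pi \in \Pi} \sum_{t=r}^{s} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr) \right]
\]
```

Taking the interval to be all of $[1, T]$ recovers standard policy regret, so a bound on adaptive regret also bounds standard regret.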

## 18 Citations

Regret-optimal Estimation and Control

- Computer Science, Mathematics
- ArXiv
- 2021

This work shows that the regret-optimal estimators and controllers can be derived in state-space form using operator-theoretic techniques from robust control and presents tight, data-dependent bounds on the regret incurred by the algorithms in terms of the energy of the disturbances.

Learning to Control under Time-Varying Environment

- Computer Science
- ArXiv
- 2022

This study establishes the first model-based online algorithm with regret guarantees under LTV dynamical systems, based on the optimism in the face of uncertainty (OFU) principle, which optimistically selects the best model in a high-confidence region.

Online Control of Unknown Time-Varying Dynamical Systems

- Computer Science, Mathematics
- NeurIPS
- 2021

A lower bound is proved showing that no algorithm can obtain sublinear regret with respect to the class of Disturbance Response policies up to the aforementioned system variability term, and offline planning over the state linear feedback policies is shown to be NP-hard, suggesting hardness of the online learning problem.

A Regret Minimization Approach to Multi-Agent Control

- Computer Science
- ICML
- 2022

This study focuses on optimal control without centralized precomputed policies, but rather with adaptive control policies for the different agents that are only equipped with a stabilizing controller, giving a reduction from any regret minimizing control method to a distributed algorithm.

Stable Online Control of Linear Time-Varying Systems

- Mathematics
- L4DC
- 2021

An efficient online control algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, that guarantees input-to-state stability for a large class of LTV systems while also minimizing the control cost is proposed.

Online estimation and control with optimal pathlength regret

- Computer Science
- 2022

This work obtains the first pathlength regret bounds for online control and estimation (e.g. Kalman filtering) in linear dynamical systems and reduces pathlength-optimal filtering and control to certain variational problems in robust estimation and control, supported by numerical simulations.

Optimal Dynamic Regret in LQR Control

- Computer Science, Mathematics
- ArXiv
- 2022

An efficient online algorithm is provided that achieves an optimal dynamic (policy) regret of Õ(n) on a family of “minibatched” quadratic losses, which could be of independent interest.

Competitive Control

- Mathematics
- ArXiv
- 2021

This work designs an online controller that competes against a clairvoyant offline optimal controller, extends competitive control to nonlinear systems using Model Predictive Control (MPC), and presents numerical experiments showing that the competitive controller can significantly outperform standard H2 and H∞ controllers in the MPC setting.

Non-stationary Online Learning with Memory and Non-stochastic Control

- Computer Science
- AISTATS
- 2022

This paper derives a novel gradient-based controller with dynamic policy regret guarantees, which is the first controller provably competitive with a sequence of changing policies for online non-stochastic control.

Online Optimization with Feedback Delay and Nonlinear Switching Cost

- Mathematics, Computer Science
- Proc. ACM Meas. Anal. Comput. Syst.
- 2022

A novel Iterative Regularized Online Balanced Descent (iROBD) algorithm has a constant, dimension-free competitive ratio that is $O(L^{2k})$, where $L$ is the Lipschitz constant of the switching cost.

## References

Showing 1-10 of 71 references

Regret-optimal control in dynamic environments

- Mathematics, Computer Science
- ArXiv
- 2020

The structure of the regret-optimal online controller is derived via a novel reduction to H∞ control, a clean data-dependent bound on its regret is presented, and numerical simulations confirm that the controller significantly outperforms the H2 and H∞ controllers in dynamic environments.

Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

- Computer Science, Mathematics
- NeurIPS
- 2018

This work presents the first provably polynomial time algorithm that provides high probability guarantees of sub-linear regret on this problem of adaptive control of the Linear Quadratic Regulator, where an unknown linear system is controlled subject to quadratic costs.

Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis

- Computer Science
- NeurIPS
- 2019

This paper designs online algorithms, Receding Horizon Gradient-based Control (RHGC), that utilize the predictions through finite steps of gradient computations, and provides a fundamental limit of the dynamic regret for any online algorithm by considering linear quadratic tracking problems.

Regret Bounds for the Adaptive Control of Linear Quadratic Systems

- Computer Science, Mathematics
- COLT
- 2011

The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm; this is the first time that a regret bound is derived for the LQ control problem.

Regret Bound of Adaptive Control in Linear Quadratic Gaussian (LQG) Systems

- Computer Science, Mathematics
- ArXiv
- 2020

The regret upper bound of O(√T) for adaptive control of linear quadratic Gaussian (LQG) systems is proved, where T is the time horizon of the problem.

Logarithmic Regret for Online Control

- Computer Science
- NeurIPS
- 2019

It is shown that the optimal regret in this fundamental setting can be significantly smaller, scaling as polylog(T), achieved by two different efficient iterative methods, online gradient descent and online natural gradient.

Regret Minimization in Partially Observable Linear Quadratic Control

- Computer Science, Mathematics
- ArXiv
- 2020

A novel way to decompose the regret is proposed and an end-to-end sublinear regret upper bound is established for partially observable linear quadratic control systems when the model dynamics are unknown a priori.

Online Linear Quadratic Control

- Computer Science, Mathematics
- ICML
- 2018

This work presents the first efficient online learning algorithms in this setting that guarantee regret under mild assumptions, and relies on a novel SDP relaxation for the steady-state distribution of the system.

Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems

- Computer Science, Mathematics
- NeurIPS
- 2020

This work presents the first model estimation method with finite-time guarantees in both open- and closed-loop system identification, together with adaptive control online learning (AdaptOn), an efficient reinforcement learning algorithm that adaptively learns the system dynamics and continuously updates its controller through online learning steps.

Model-Free Linear Quadratic Control via Reduction to Expert Prediction

- Computer Science
- AISTATS
- 2019

This work presents a new model-free algorithm for controlling linear quadratic (LQ) systems, and shows that its regret scales as $O(T^{\xi+2/3})$ for any small $\xi>0$ if time horizon satisfies $T>.