Corpus ID: 220425135

Adaptive Regret for Control of Time-Varying Dynamics

@article{Gradu2020AdaptiveRF,
  title={Adaptive Regret for Control of Time-Varying Dynamics},
  author={Paula Gradu and Elad Hazan and Edgar Minasyan},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.04393}
}
We consider regret minimization for online control with time-varying linear dynamical systems. The metric of performance we study is adaptive policy regret, or regret compared to the best policy on any interval in time. We give an efficient algorithm that attains first-order adaptive regret guarantees for the setting of online convex optimization with memory. We also show that these first-order bounds are nearly tight. This algorithm is then used to derive a controller with adaptive…
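As a rough illustration of the metric named in the abstract, adaptive policy regret can be written as the worst-case regret over all contiguous intervals. The notation here is an assumption for illustration and is not taken from this page: $c_t$ is the cost at time $t$, $(x_t, u_t)$ is the state and control produced by the online controller, $(x_t^{\pi}, u_t^{\pi})$ is the trajectory that a fixed policy $\pi$ from a comparator class $\Pi$ would have produced, and $T$ is the horizon.

```latex
\[
\mathrm{AdaptiveRegret}_T
  \;=\;
  \max_{1 \le r \le s \le T}
  \left(
    \sum_{t=r}^{s} c_t(x_t, u_t)
    \;-\;
    \min_{\pi \in \Pi} \sum_{t=r}^{s} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr)
  \right)
\]
```

Restricting the maximum to the single interval $[1, T]$ recovers the usual (non-adaptive) policy regret, so a bound on the adaptive quantity is the stronger of the two.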
Regret-optimal Estimation and Control
TLDR
This work shows that the regret-optimal estimators and controllers can be derived in state-space form using operator-theoretic techniques from robust control and presents tight, data-dependent bounds on the regret incurred by the algorithms in terms of the energy of the disturbances.
Stable Online Control of Linear Time-Varying Systems
TLDR
An efficient online control algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, is proposed that guarantees input-to-state stability for a large class of LTV systems while also minimizing the control cost.
Competitive Control
TLDR
This work designs an online controller that competes against a clairvoyant offline optimal controller, extends competitive control to nonlinear systems using Model Predictive Control (MPC), and presents numerical experiments showing that the competitive controller can significantly outperform standard H2 and H∞ controllers in the MPC setting.
Improving Tractability of Real-Time Control Schemes via Simplified S-Lemma
Various control schemes rely on a solution of a convex optimization problem involving a particular robust quadratic constraint, which can be reformulated as a linear matrix inequality using the…
Improving Tractability of Real-Time Control Schemes via Simplified $\mathcal{S}$-Lemma.
TLDR
This work uses some recent advances in robust optimization that allow it to reformulate such a robust constraint as a set of linear and second-order cone constraints, which are computationally better suited to real-time applications.
Generating Adversarial Disturbances for Controller Verification
TLDR
An online learning approach is proposed that adaptively generates disturbances based on control inputs chosen by the controller; it competes with the best disturbance generator in hindsight and outperforms several baseline approaches, including $H_{\infty}$ disturbance generation and gradient-based methods.
Deluca - A Differentiable Control Library: Environments, Methods, and Benchmarking
TLDR
This work presents an open-source library of natively differentiable physics and robotics environments, accompanied by gradient-based control methods and a benchmarking suite, and provides a novel differentiable environment, based on deep neural networks, that simulates medical ventilation.
Non-stationary Online Learning with Memory and Non-stochastic Control
TLDR
This paper derives a novel gradient-based controller with dynamic policy regret guarantees, which is the first controller competitive with a sequence of changing policies, and applies the results to the problem of online non-stochastic control, i.e., controlling a linear dynamical system with adversarial disturbances and convex loss functions.

References

Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator
TLDR
This work presents the first provably polynomial-time algorithm that provides high-probability guarantees of sub-linear regret for adaptive control of the Linear Quadratic Regulator, where an unknown linear system is controlled subject to quadratic costs.
Regret Bounds for the Adaptive Control of Linear Quadratic Systems
TLDR
The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm; this is the first time that a regret bound is derived for the LQ control problem.
Regret Bound of Adaptive Control in Linear Quadratic Gaussian (LQG) Systems
TLDR
A regret upper bound of $O(\sqrt{T})$ for adaptive control of linear quadratic Gaussian (LQG) systems is proved, where $T$ is the time horizon of the problem.
Logarithmic Regret for Online Control
TLDR
It is shown that the optimal regret in this fundamental setting can be significantly smaller, scaling as polylog(T), achieved by two different efficient iterative methods, online gradient descent and online natural gradient.
Regret Minimization in Partially Observable Linear Quadratic Control
TLDR
A novel way to decompose the regret is proposed and an end-to-end sublinear regret upper bound is established for partially observable linear quadratic control systems when the model dynamics are unknown a priori.
Model-Free Linear Quadratic Control via Reduction to Expert Prediction
TLDR
This work presents a new model-free algorithm for controlling linear quadratic (LQ) systems, and shows that its regret scales as $O(T^{\xi+2/3})$ for any small $\xi>0$ if the time horizon satisfies $T>$…
Towards Provable Control for Unknown Linear Dynamical Systems
Certainty Equivalence is Efficient for Linear Quadratic Control
TLDR
To the best of our knowledge, this result is the first sub-optimality guarantee in the partially observed Linear Quadratic Gaussian (LQG) setting and improves upon recent work by Dean et al. (2017), who present an algorithm achieving a sub-optimality gap linear in the parameter error.
Certainty Equivalent Control of LQR is Efficient
TLDR
The results show that certainty equivalent control with $\varepsilon$-greedy exploration achieves $\tilde{\mathcal{O}}(\sqrt{T})$ regret in the adaptive LQR setting, yielding the first guarantee of a computationally tractable algorithm that achieves nearly optimal regret for adaptive LQR.
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
TLDR
This work bridges the gap, showing that (model-free) policy gradient methods globally converge to the optimal solution and are efficient (polynomially so in relevant problem-dependent quantities) with regard to their sample and computational complexities.