Corpus ID: 235377450

Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory

@inproceedings{Zhang2022AdversarialTC,
  title={Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory},
  author={Zhiyu Zhang and Ashok Cutkosky and Ioannis Ch. Paschalidis},
  booktitle={AISTATS},
  year={2022}
}
We consider the problem of tracking an adversarial state sequence in a linear dynamical system subject to adversarial disturbances and loss functions, generalizing earlier settings in the literature. To this end, we develop three techniques, each of independent interest. First, we propose a comparator-adaptive algorithm for online linear optimization with movement cost. Without tuning, it nearly matches the performance of the optimally tuned gradient descent in hindsight. Next, considering a… 
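The OLO-with-movement-cost setting the abstract describes can be sketched concretely. Below is a minimal baseline, not the paper's comparator-adaptive method: plain online gradient descent on linear losses ⟨g_t, x_t⟩ over an L2 ball, where each round additionally pays a movement cost proportional to ‖x_t − x_{t−1}‖. The step size `eta`, radius `radius`, and movement weight `lam` are illustrative parameters, not values from the paper.

```python
import numpy as np

def ogd_with_movement_cost(grads, eta, radius=1.0, lam=1.0):
    """Online gradient descent on linear losses <g_t, x_t> over an L2 ball,
    additionally charging a movement cost lam * ||x_t - x_{t-1}|| per round.
    Returns the total loss including movement. This is the tuned baseline
    the abstract compares against, not the comparator-adaptive algorithm."""
    d = len(grads[0])
    x = np.zeros(d)
    total = 0.0
    for g in grads:
        g = np.asarray(g, dtype=float)
        total += float(g @ x)                # linear loss of the current play
        x_new = x - eta * g                  # gradient step
        norm = np.linalg.norm(x_new)
        if norm > radius:                    # project back onto the ball
            x_new *= radius / norm
        total += lam * float(np.linalg.norm(x_new - x))  # movement cost
        x = x_new
    return total
```

The optimally tuned `eta` depends on quantities unknown in advance (e.g. the comparator's norm); the paper's first technique nearly matches this tuned baseline without any tuning.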

Figures from this paper

Citations

Smoothed Online Convex Optimization Based on Discounted-Normal-Predictor
TLDR: It is demonstrated that Discounted-Normal-Predictor can be used to yield nearly optimal regret bounds over any interval, even in the presence of switching costs, and a simple algorithm for SOCO is developed that minimizes the adaptive regret with switching cost.
Non-stationary Online Learning with Memory and Non-stochastic Control
TLDR: This paper derives a novel gradient-based controller with dynamic policy regret guarantees, which is the first controller provably competitive with a sequence of changing policies for online non-stochastic control.
Exploiting the Curvature of Feasible Sets for Faster Projection-Free Online Learning
TLDR: An OCO algorithm is presented that makes two calls to a linear optimization (LO) oracle per round and achieves the near-optimal Õ(√T) regret whenever the feasible set is strongly convex.
Optimal Parameter-free Online Learning with Switching Cost
TLDR: A simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost is proposed, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate.
Adaptive Gradient Methods with Local Guarantees
TLDR: This paper proposes an adaptive gradient method with provable adaptive regret guarantees against the best local preconditioner, and proves a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods.
PDE-Based Optimal Strategy for Unconstrained Online Learning
TLDR: The proposed algorithm is the first to achieve an optimal loss-regret trade-off without the impractical doubling trick, and a matching lower bound shows that the leading-order term, including the constant multiplier √2, is tight.

References

Showing 1–10 of 60 references
Non-stationary Online Learning with Memory and Non-stochastic Control
TLDR: This paper derives a novel gradient-based controller with dynamic policy regret guarantees, which is the first controller provably competitive with a sequence of changing policies for online non-stochastic control.
Logarithmic Regret for Adversarial Online Control
TLDR: A new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances is introduced, giving the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences, provided the state and control costs are given by known quadratic functions.
Tracking Adversarial Targets
TLDR: An efficient algorithm is presented for linear control problems with quadratic losses and adversarially chosen tracking targets; its regret with respect to an optimal linear policy grows as O(log²T), where T is the number of rounds of the game.
Online Control with Adversarial Disturbances
TLDR: The objective is to design an online control procedure that performs nearly as well as a procedure with full knowledge of the disturbances in hindsight; the main result is an efficient algorithm that provides nearly tight regret bounds for this problem.
Online Linear Quadratic Control
TLDR: This work presents the first efficient online learning algorithms in this setting that guarantee regret under mild assumptions, relying on a novel SDP relaxation for the steady-state distribution of the system.
Online Learning with Adversarial Delays
TLDR: It is shown that online gradient descent and follow-the-perturbed-leader achieve regret O(√D) in the delayed setting, where D is the sum of the delays of each round's feedback.
Online learning with dynamics: A minimax perspective
TLDR: This work provides a unifying analysis that recovers regret bounds for several well-studied problems, including online learning with memory, online control of linear quadratic regulators, online Markov decision processes, and tracking adversarial targets.
Online Learning for Adversaries with Memory: Price of Past Mistakes
TLDR: This work extends the notion of learning with memory to the general Online Convex Optimization (OCO) framework and presents two algorithms that attain low regret.
Online Optimization with Memory and Competitive Control
TLDR: The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio and shows a connection between online optimization with memory and online control with adversarial disturbances.
Making Non-Stochastic Control (Almost) as Easy as Stochastic
Recent literature has made much progress in understanding online LQR: a modern learning-theoretic take on the classical control problem in which a learner attempts to optimally control an…