• Corpus ID: 235377450

@inproceedings{Zhang2022AdversarialTC,
  title={Adversarial Tracking Control via Strongly Adaptive Online Learning with Memory},
  author={Zhiyu Zhang and Ashok Cutkosky and Ioannis Ch. Paschalidis},
  booktitle={AISTATS},
  year={2022}
}
• Published at AISTATS 2022 (first posted 2 February 2021)
• Computer Science
We consider the problem of tracking an adversarial state sequence in a linear dynamical system subject to adversarial disturbances and loss functions, generalizing earlier settings in the literature. To this end, we develop three techniques, each of independent interest. First, we propose a comparator-adaptive algorithm for online linear optimization with movement cost. Without tuning, it nearly matches the performance of the optimally tuned gradient descent in hindsight. Next, considering a…
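The first technique mentioned in the abstract concerns online linear optimization (OLO) with movement cost: the learner pays both a linear loss on each play and a penalty for moving its iterate, and the paper's comparator-adaptive method matches, without tuning, the optimally tuned gradient descent on this objective. A minimal sketch of that gradient-descent baseline (the step size `lr` and movement weight `lam` are illustrative assumptions, not values from the paper):

```python
import numpy as np

def ogd_with_movement_cost(grads, lr=0.1, lam=1.0):
    """Online gradient descent on linear losses g_t . x_t.

    Returns the total cost: the sum of linear losses plus the
    movement cost lam * ||x_t - x_{t-1}|| paid at each update.
    """
    x = np.zeros_like(grads[0], dtype=float)
    total = 0.0
    for g in grads:
        total += float(g @ x)                 # linear loss at the current play
        x_next = x - lr * g                   # untuned OGD step
        total += lam * float(np.linalg.norm(x_next - x))  # movement cost
        x = x_next
    return total

rng = np.random.default_rng(0)
grads = [rng.standard_normal(3) for _ in range(50)]
total_cost = ogd_with_movement_cost(grads)
```

The optimally tuned baseline would pick `lr` separately for each gradient sequence in hindsight; the paper's contribution is matching that performance adaptively, without knowing the sequence in advance.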

## Citations

Smoothed Online Convex Optimization Based on Discounted-Normal-Predictor
• Computer Science
ArXiv
• 2022
It is demonstrated that Discounted-Normal-Predictor can yield nearly optimal regret bounds over any interval, even in the presence of switching costs, and a simple algorithm for SOCO is developed that minimizes adaptive regret with switching cost.
Non-stationary Online Learning with Memory and Non-stochastic Control
• Computer Science
AISTATS
• 2022
This paper derives a novel gradient-based controller with dynamic policy regret guarantees, the first controller provably competitive with a sequence of changing policies for online non-stochastic control.
Exploiting the Curvature of Feasible Sets for Faster Projection-Free Online Learning
An OCO algorithm is presented that makes two calls to a linear-optimization (LO) oracle per round and achieves the near-optimal Õ(√T) regret whenever the feasible set is strongly convex.
Optimal Parameter-free Online Learning with Switching Cost
• Computer Science
ArXiv
• 2022
A simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost is proposed, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate.
• Computer Science
ArXiv
• 2022
This paper proposes an adaptive gradient method that has provable adaptive regret guarantees vs. the best local preconditioner, and proves a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods.
PDE-Based Optimal Strategy for Unconstrained Online Learning
• Computer Science
ICML
• 2022
The proposed algorithm is the first to achieve an optimal loss-regret trade-off without the impractical doubling trick, and a matching lower bound shows that the leading-order term, including the constant multiplier √2, is tight.

## References

Showing 1–10 of 60 references
Non-stationary Online Learning with Memory and Non-stochastic Control
• Computer Science
AISTATS
• 2022
This paper derives a novel gradient-based controller with dynamic policy regret guarantees, the first controller provably competitive with a sequence of changing policies for online non-stochastic control.
Logarithmic Regret for Adversarial Online Control
• Computer Science, Mathematics
ICML
• 2020
A new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances is introduced, giving the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences, provided the state and control costs are given by known quadratic functions.
• Computer Science, Mathematics
ICML
• 2014
An efficient algorithm is presented for linear control problems with quadratic losses and adversarially chosen tracking targets, and its regret with respect to an optimal linear policy grows as O(log² T), where T is the number of rounds of the game.
• Mathematics, Computer Science
ICML
• 2019
The objective is to design an online control procedure that performs nearly as well as a procedure with full knowledge of the disturbances in hindsight; the main result is an efficient algorithm that provides nearly tight regret bounds for this problem.
• Computer Science, Mathematics
ICML
• 2018
This work presents the first efficient online learning algorithms in this setting that guarantee regret under mild assumptions, and relies on a novel SDP relaxation for the steady-state distribution of the system.
• Computer Science
NIPS
• 2015
It is shown that online-gradient-descent and follow-the-perturbed-leader achieve regret O(√D) in the delayed setting, where D is the sum of delays of each round's feedback.
Online learning with dynamics: A minimax perspective
• Computer Science
NeurIPS
• 2020
This work provides a unifying analysis that recovers regret bounds for several well studied problems including online learning with memory, online control of linear quadratic regulators, online Markov decision processes, and tracking adversarial targets.
Online Learning for Adversaries with Memory: Price of Past Mistakes
• Computer Science
NIPS
• 2015
This work extends the notion of learning with memory to the general Online Convex Optimization (OCO) framework, and presents two algorithms that attain low regret.
Online Optimization with Memory and Competitive Control
• Computer Science
NeurIPS
• 2020
The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio and shows a connection between online optimization with memory and online control with adversarial disturbances.
Making Non-Stochastic Control (Almost) as Easy as Stochastic
Recent literature has made much progress in understanding *online LQR*: a modern learning-theoretic take on the classical control problem in which a learner attempts to optimally control an…