Corpus ID: 236956700

Online Multiobjective Minimax Optimization and Applications

  title={Online Multiobjective Minimax Optimization and Applications},
  author={Georgy Noarov and Mallesh M. Pai and Aaron Roth},
We introduce a simple but general online learning framework, in which at every round, an adaptive adversary introduces a new game, consisting of an action space for the learner, an action space for the adversary, and a vector valued objective function that is convex-concave in every coordinate. The learner and the adversary then play in this game. The learner’s goal is to play so as to minimize the maximum coordinate of the cumulative vector-valued loss. The resulting oneshot game is not convex… Expand
1 Citations
A User Friendly Power Tool for Deriving Online Learning Algorithms (Invited Talk)
This talk overviews a simple and user friendly framework that can be used to derive online learning algorithms in a number of settings, and uses it to derive a new variant of Blackwell’s Approachability Theorem, which is term “Fast Polytope Approachability”. Expand


Efficient learning algorithms for changing environments
A different performance metric is proposed which strengthens the standard metric of regret and measures performance with respect to a changing comparator and can be applied to various learning scenarios, i.e. online portfolio selection, for which there are experimental results showing the advantage of adaptivity. Expand
A General Class of No-Regret Learning Algorithms and Game-Theoretic Equilibria
The tightest game-theoretic solution concept to which Φ-no-regret algorithms (provably) converge is correlated equilibrium, and Nash equilibrium is not a necessary outcome of learning via any Φ -no- Regret learning algorithms. Expand
Exponential Weight Approachability, Applications to Calibration and Regret Minimization
Basic ideas behind the “exponential weight algorithm” (designed for aggregation or minimization of regret) can be transposed into the theory of Blackwell approachability, and an algorithm is developed bounding the distance of average vector payoffs to some product set, with a logarithmic dependency in the dimension of the ambient space. Expand
From External to Internal Regret
This paper gives a simple generic reduction that, given an algorithms for the external regret problem, converts it to an efficient online algorithm for the internal regret problem and derives a quantitative regret bound for a very general setting of regret. Expand
A Closer Look at Adaptive Regret
It is proved that Fixed Share is optimal for adaptive regret: the worst-case adaptive regret of any algorithm is at least that of an instance of Fixed Share. Expand
Blackwell Approachability and No-Regret Learning are Equivalent
It is shown that any algorithm for one such problem can be eciently converted into an algorithm for the other, and one novel application of this reduction is provided: the rst ecient algorithm for calibrated forecasting. Expand
An analog of the minimax theorem for vector payoffs.
for all i, j . Thus in the (two-person, zero-sum) game with matrix Λf, player I has a strategy insuring an expected gain of at least v, and player II has a strategy insuring an expected loss of atExpand
Online Learning: Beyond Regret
This framework simultaneously captures such well-known notions as internal and general regret, learning with non-additive global cost functions, Blackwell’s approachability, calibration of forecasters, and more and shows that learnability in all these situations is due to control of the same three quantities. Expand
Regret bounds for sleeping experts and bandits
This work compares algorithms against the payoff obtained by the best ordering of the actions, which is a natural benchmark for this type of problem and gives algorithms achieving information-theoretically optimal regret bounds with respect to the best-ordering benchmark. Expand
Relax and Randomize : From Value to Algorithms
A principled way of deriving online learning algorithms from a minimax analysis is shown, including a family of randomized methods that use the idea of a "random playout" and new versions of the Follow-the-Perturbed-Leader algorithms. Expand