# Relaxing the I.I.D. Assumption: Adaptive Minimax Optimal Sequential Prediction with Expert Advice

@article{Bilodeau2020RelaxingTI, title={Relaxing the I.I.D. Assumption: Adaptive Minimax Optimal Sequential Prediction with Expert Advice}, author={Blair Bilodeau and Jeffrey Negrea and Daniel M. Roy}, journal={ArXiv}, year={2020}, volume={abs/2007.06552} }

We consider sequential prediction with expert advice when the data are generated stochastically, but the distributions generating the data may vary arbitrarily among some constraint set. We quantify relaxations of the classical I.I.D. assumption in terms of possible constraint sets, with I.I.D. at one extreme, and an adversarial mechanism at the other. The Hedge algorithm, long known to be minimax optimal in the adversarial regime, has recently been shown to also be minimax optimal in the I.I.D…
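
For readers unfamiliar with Hedge, it maintains exponentially decaying weights over the experts' cumulative losses. A minimal sketch under standard assumptions (the fixed learning rate `eta` and the loss matrix are illustrative, not taken from the paper, which concerns the anytime variant with decreasing learning rate):

```python
import numpy as np

def hedge(loss_matrix, eta):
    """Run Hedge (exponential weights) on a (T, K) matrix of losses in [0, 1]
    for K experts over T rounds. Returns the learner's cumulative expected
    loss and the experts' cumulative losses."""
    T, K = loss_matrix.shape
    cum_losses = np.zeros(K)
    learner_loss = 0.0
    for t in range(T):
        w = np.exp(-eta * cum_losses)   # weight experts by past performance
        w /= w.sum()
        learner_loss += w @ loss_matrix[t]  # expected loss of the randomized play
        cum_losses += loss_matrix[t]
    return learner_loss, cum_losses
```

With losses in [0, 1], the standard analysis bounds the regret against the best expert by ln(K)/eta + eta*T/8 for a fixed learning rate.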

#### References

Showing 1–10 of 38 references

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

- Computer Science, Mathematics
- NIPS
- 2011

This work defines the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data and defines a notion of distribution-dependent Rademacher complexity for the spectrum of problems ranging from i.i.d. to worst-case.

Adaptation to Easy Data in Prediction with Limited Advice

- Computer Science, Mathematics
- NeurIPS
- 2018

This work presents an online learning algorithm with improved regret guarantees for "easy" loss sequences; in the stochastic setting, SODA achieves a pseudo-regret bound that holds simultaneously with the adversarial regret guarantee.

Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

- Computer Science, Mathematics
- NIPS
- 2016

This work considers online learning algorithms that guarantee worst-case regret rates in adversarial environments, yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance), and quantifies the friendliness of stochastic environments by means of the well-known Bernstein condition.

On the optimality of the Hedge algorithm in the stochastic regime

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2019

It is proved that anytime Hedge with decreasing learning rate, one of the simplest algorithms for prediction with expert advice, is remarkably both worst-case optimal and adaptive to the easier stochastic and adversarial-with-a-gap regimes.

An Optimal Algorithm for Stochastic and Adversarial Bandits

- Computer Science, Mathematics
- AISTATS
- 2019

The proposed algorithm enjoys improved regret guarantees in two intermediate regimes: the moderately contaminated stochastic regime defined by Seldin and Slivkins (2014) and the stochastically constrained adversary studied by Wei and Luo (2018).

The Best of Both Worlds: Stochastic and Adversarial Bandits

- Mathematics, Computer Science
- COLT
- 2012

SAO (Stochastic and Adversarial Optimal) combines the O(√n) worst-case regret of Exp3 for adversarial rewards with the (poly)logarithmic regret of UCB1 for stochastic rewards.
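
As a reminder of the stochastic half of that combination, UCB1 plays the arm maximizing an optimism bonus added to the empirical mean. A minimal sketch of the index (not of SAO itself; the variable names are ours):

```python
import math

def ucb1_index(empirical_mean, n_pulls, t):
    """UCB1 index for one arm: empirical mean plus a confidence bonus.
    The bonus grows with the round t but shrinks as the arm is pulled
    more often; at each round, pull the arm with the largest index."""
    return empirical_mean + math.sqrt(2.0 * math.log(t) / n_pulls)
```

The bonus ensures under-explored arms are eventually sampled, which yields the logarithmic pseudo-regret in the stochastic regime.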

Minimax Policies for Adversarial and Stochastic Bandits.

- Computer Science
- COLT
- 2009

This work fills a long-open gap in the characterization of the minimax rate for the multi-armed bandit problem and proposes a new family of randomized algorithms based on an implicit normalization, together with a new analysis.

Prediction with Expert Advice by Following the Perturbed Leader for General Weights

- Computer Science, Mathematics
- ALT
- 2004

The analysis of the alternative "Follow the Perturbed Leader" (FPL) algorithm from [KV03] (based on Hannan's algorithm) is simpler, and loss bounds are derived for adaptive learning rates and for both finite expert classes with uniform weights and countable expert classes with arbitrary weights.
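
FPL itself can be sketched in a few lines: perturb each expert's cumulative loss with independent random noise and follow the perturbed leader. The exponential perturbation below follows the Kalai–Vempala style of analysis; the general-weights variant analyzed in this reference differs in its details:

```python
import numpy as np

def fpl_round(cum_losses, eta, rng):
    """One round of Follow the Perturbed Leader: subtract exponential
    noise with scale 1/eta from each expert's cumulative loss and pick
    the expert whose perturbed loss is smallest."""
    noise = rng.exponential(scale=1.0 / eta, size=cum_losses.shape)
    return int(np.argmin(cum_losses - noise))
```

Because the noise is redrawn each round, the chosen expert is random, which is what protects FPL against adversarial loss sequences.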

One Practical Algorithm for Both Stochastic and Adversarial Bandits

- Computer Science, Engineering
- ICML
- 2014

The algorithm is based on augmentation of the EXP3 algorithm with a new control lever in the form of exploration parameters that are tailored individually for each arm, and retains a "logarithmic" regret guarantee in the stochastic regime even when some observations are contaminated by an adversary.

A second-order bound with excess losses

- Mathematics, Computer Science
- COLT
- 2014

Online aggregation of the predictions of experts is studied, and new second-order regret bounds in the standard setting are obtained via a version of the Prod algorithm with multiple learning rates and two versions of the polynomially weighted average algorithm.