• Corpus ID: 8656612

Adaptive Hedge

Tim van Erven, Peter Grünwald, Wouter M. Koolen, Steven de Rooij
Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the… 
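To make the role of the learning rate concrete, here is a minimal sketch of the standard Hedge (exponential weights) update with a fixed learning rate. It does not implement the adaptive tuning proposed in the paper, which the truncated abstract above does not spell out. The function names `hedge_weights` and `run_hedge` and the toy loss sequence are illustrative assumptions, not taken from the source.

```python
import math

def hedge_weights(cumulative_losses, eta):
    """Exponential weights over actions for learning rate eta."""
    # Subtract the minimum loss before exponentiating for numerical stability;
    # the shift cancels in the normalization.
    m = min(cumulative_losses)
    unnorm = [math.exp(-eta * (loss - m)) for loss in cumulative_losses]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def run_hedge(loss_rounds, eta):
    """Return (learner's cumulative expected loss, regret to the best action)."""
    k = len(loss_rounds[0])
    cum = [0.0] * k
    learner_loss = 0.0
    for losses in loss_rounds:
        w = hedge_weights(cum, eta)  # weights depend only on past losses
        learner_loss += sum(wi * li for wi, li in zip(w, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return learner_loss, learner_loss - min(cum)

if __name__ == "__main__":
    # Assumed "easy" instance: action 0 never incurs loss, the others always do.
    rounds = [[0.0, 1.0, 1.0]] * 50
    for eta in (0.1, 1.0, 5.0):
        _, regret = run_hedge(rounds, eta)
        print(f"eta={eta}: regret={regret:.3f}")
```

On this toy sequence, a larger fixed learning rate concentrates weight on the best action sooner and so incurs less regret, which illustrates why a conservatively tuned worst-case rate can be suboptimal on easy instances.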


Learning The Best Expert Efficiently
It is shown that a particular lazy form of the online subgradient algorithm can be used to achieve minimal regret in a number of "easy" regimes while retaining an $O(\sqrt{n})$ worst-case regret guarantee.
Dying Experts: Efficient Algorithms with Optimal Regret Bounds
This work studies a variant of decision-theoretic online learning in which the set of experts that are available to Learner can shrink over time, and provides matching upper and lower bounds on the ranking regret in the fully adversarial setting.
k-experts - Online Policies and Fundamental Limits
A tight regret lower bound is established for a variant of the k-experts problem, a generalization of the classic Prediction with Expert’s Advice framework, and the mistake bounds achievable by online learning policies for stable loss functions are characterized.
Best of many worlds: Robust model selection for online supervised learning
This work shows that by incorporating a probabilistic framework of structural risk minimization into existing adaptive algorithms, they can robustly learn not only the presence of stochastic structure when it exists, but also the correct model order.
Learning the Learning Rate for Prediction with Expert Advice
This work introduces an approach to learn the learning rate, which performs as well as if it knew the empirically best learning rate from a large range that includes both conservative small values and values that are much higher than those for which formal guarantees were previously available.
Open Problem: Shifting Experts on Easy Data
A number of online algorithms have been developed that have small additional loss (regret) compared to the best "shifting expert". In this model, there is a set of experts and the comparator is the
$k\texttt{-experts}$ -- Online Policies and Fundamental Limits
SAGE (Sampled Hedge), a framework for designing efficient online learning policies by leveraging statistical sampling techniques, is proposed, and it is shown that SAGE either achieves the first sublinear regret guarantee or improves upon the existing ones.
Shifting Experts on Easy Data
This work asks what simple and efficient algorithms exist for instances of the shifting experts problem when the loss sequence in each segment is i.i.d., and how to efficiently unite the performance of such algorithms on easy data with worst-case robustness.
Follow the Leader with Dropout Perturbations
This simple, tuning-free version of the FPL algorithm achieves two feats: optimal worst-case $O(\sqrt{L \ln K} + \ln K)$ regret as a function of $L$, and optimal $O(\ln K)$ regret when the loss vectors are drawn i.i.d. from a fixed distribution and there is a gap between the expected loss of the best expert and all others.
Decision Making under Stochastic Constraints
The Lagrangian exponentially weighted average (LEWA) algorithm is proposed; it is a primal-dual variant of the well-known exponentially weighted average algorithm, inspired by the theory of the Lagrangian method in constrained optimization.


A game of prediction with expert advice
  • V. Vovk
  • Computer Science
    COLT '95
  • 1995
We consider the following problem. At each point of discrete time the learner must make a prediction; he is given the predictions made by a pool of experts. Each prediction and the outcome, which is
The Weighted Majority Algorithm
A simple and effective method, based on weighted voting, is introduced for constructing a compound algorithm in a situation in which a learner faces a sequence of trials, and the goal of the learner is to make few mistakes.
Sequential Prediction of Individual Sequences Under General Loss Functions
A simplified analysis with an explicit closed form for the constant in the minimax regret formula is given, and a probabilistic argument is given that shows this constant is the best possible.
Prediction, learning, and games
This chapter discusses prediction with expert advice, efficient forecasters for large classes of experts, and randomized prediction for specific losses.
Competitive On‐line Statistics
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive on‐line algorithms, has arisen over
This course can be used as a preparation for the first (Probability) actuarial exam and the central limit theorem and classical sampling distributions.
Adaptive and Self-Confident On-Line Learning Algorithms
This paper shows that essentially the same optimized bounds can be obtained when the algorithms adaptively tune their learning rates as the examples in the sequence are progressively revealed, even though the quantities needed for optimal tuning depend on the whole sequence of examples.
How to use expert advice
This work analyzes algorithms that predict a binary value by combining the predictions of several prediction strategies, called 'experts', and shows how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context.
Extracting certainty from uncertainty: regret bounded by variation in costs
The question of whether it is possible to bound the regret of an online algorithm by the variation of the observed costs is resolved, and bounds in the fully adversarial setting are proved in two important online learning scenarios: prediction from expert advice and online linear optimization.
A decision-theoretic generalization of on-line learning and an application to boosting
The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.