Implicit Parameter-free Online Learning with Truncated Linear Models

@inproceedings{Chen2022ImplicitPO,
  title={Implicit Parameter-free Online Learning with Truncated Linear Models},
  author={Keyi Chen and Ashok Cutkosky and Francesco Orabona},
  booktitle={ALT},
  year={2022}
}
Parameter-free algorithms are online learning algorithms that do not require setting learning rates. They achieve optimal regret with respect to the distance between the initial point and any competitor. Yet, parameter-free algorithms do not take into account the geometry of the losses. Recently, in the stochastic optimization literature, it has been proposed to instead use truncated linear lower bounds, which produce better performance by more closely modeling the losses. In particular… 
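
Concretely, where gradient-based methods work with the linearization l_t(x_t) + <g_t, x - x_t>, a truncated linear model clips that linearization at a known lower bound of the loss (zero for non-negative losses), so the surrogate never drops below values the true loss cannot attain. A minimal sketch of the two surrogates, with illustrative names and assuming non-negative convex losses (not the paper's exact construction):

import numpy as np

def linear_model(loss_val, grad, x_t, x):
    # Standard first-order lower bound l_t(x_t) + <g_t, x - x_t>:
    # it can become arbitrarily negative far from x_t.
    return loss_val + grad @ (x - x_t)

def truncated_linear_model(loss_val, grad, x_t, x, lower_bound=0.0):
    # Truncated linear model: the same linearization, clipped at a known
    # lower bound of the loss (0 for non-negative losses), so it remains
    # a tighter model of the true loss away from x_t.
    return max(lower_bound, loss_val + grad @ (x - x_t))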

Better Parameter-free Stochastic Optimization with ODE Updates for Coin-Betting
TLDR
This paper shows empirically that this new parameter-free algorithm based on continuous-time Coin-Betting on truncated models outperforms algorithms with the "best default" learning rates and almost matches the performance of finely tuned baselines without anything to tune.
Parameter-free Mirror Descent
TLDR
A modified online mirror descent framework that is suitable for building adaptive and parameter-free algorithms in unbounded domains is developed and it is demonstrated that natural strategies based on Follow-the-Regularized-Leader are unable to achieve similar results.
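
For background, the unmodified online mirror descent template updates through a mirror map rather than directly in parameter space; the framework above adapts this template to unbounded domains without tuned learning rates. A minimal sketch of the standard step for two common mirror maps (generic background only, not the modified framework from the paper):

import numpy as np

def omd_step_euclidean(x_t, grad, eta):
    # Online mirror descent with the squared-Euclidean mirror map
    # psi(x) = 0.5 * ||x||^2, for which the mirror step reduces to a
    # plain gradient step: x_{t+1} = x_t - eta * grad.
    return x_t - eta * grad

def omd_step_entropy(x_t, grad, eta):
    # The same template with the negative-entropy mirror map on the
    # probability simplex gives the exponentiated-gradient update.
    w = x_t * np.exp(-eta * grad)
    return w / w.sum()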

References

SHOWING 1-10 OF 45 REFERENCES
Fully Implicit Online Learning
TLDR
This paper studies a class of regularized online algorithms that do not linearize the loss function or the regularizer, called FIOL, and shows that for an arbitrary Bregman divergence FIOL attains $O(\sqrt{T})$ regret in the general convex setting, with a one-step improvement over linearized updates because it avoids the approximation error of linearization.
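
An implicit update of the kind studied here solves a small proximal problem with the true loss instead of its linearization; for the squared loss the minimizer has a closed form. A minimal sketch of that closed-form case (assuming Euclidean regularization rather than the paper's general Bregman divergences):

import numpy as np

def implicit_step_squared_loss(x_t, a_t, y_t, eta):
    # Implicit (proximal) update for the squared loss
    #   l_t(x) = 0.5 * (a_t @ x - y_t)^2:
    #   x_{t+1} = argmin_x l_t(x) + (1 / (2 * eta)) * ||x - x_t||^2.
    # Setting the gradient to zero yields the closed form below.
    residual = a_t @ x_t - y_t
    return x_t - (eta * residual / (1.0 + eta * (a_t @ a_t))) * a_t
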
Adaptive scale-invariant online algorithms for learning linear models
TLDR
This paper proposes online algorithms whose predictions are invariant under arbitrary rescaling of the features and which achieve regret bounds matching those of OGD with optimally tuned per-dimension learning rates, while retaining comparable runtime performance.
Better Parameter-free Stochastic Optimization with ODE Updates for Coin-Betting
TLDR
This paper shows empirically that this new parameter-free algorithm based on continuous-time Coin-Betting on truncated models outperforms algorithms with the "best default" learning rates and almost matches the performance of finely tuned baselines without anything to tune.
Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning
TLDR
This paper proposes a new kernel-based stochastic gradient descent algorithm that performs model selection while training, with no parameters to tune and no cross-validation, estimating the right amount of regularization over time in a data-dependent way.
Parameter-free Online Convex Optimization with Sub-Exponential Noise
TLDR
It is shown that it is possible to circumvent the lower bound by allowing the observed subgradients to be unbounded via stochastic noise, and a novel parameter-free OCO algorithm for Banach spaces, called BANCO, is proposed that achieves the optimal regret rate.
Lipschitz and Comparator-Norm Adaptivity in Online Learning
TLDR
Two prior reductions to the unbounded setting are generalized: one so that it no longer needs hints, and a second to deal with the range-ratio problem (which already arises in prior work).
Online Learning Without Prior Information
TLDR
This work describes a frontier of new lower bounds on the performance of optimization and online learning algorithms, reflecting a tradeoff between a term that depends on the optimal parameter value and a term that depends on the gradients' rate of growth.
Implicit Online Learning
TLDR
This paper analyzes a class of online learning algorithms based on fixed potentials and non-linearized losses, which yields algorithms with implicit update rules; it provides improved algorithms and bounds for the online metric learning problem and shows improved robustness for online linear prediction problems.
Training Deep Networks without Learning Rates Through Coin Betting
TLDR
This paper proposes a new stochastic gradient descent procedure for deep networks that requires no learning rate setting: it reduces the optimization process to a game of betting on a coin and derives an optimal, learning-rate-free betting strategy for it.
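
The coin-betting reduction treats the negative gradients as outcomes of a gambling game: the prediction is the current bet, and wealth grows whenever the bet agrees with the outcome. A minimal one-dimensional sketch using the Krichevsky-Trofimov betting fraction, assuming |g_t| <= 1 (illustrative only, not the exact procedure proposed for deep networks in this paper):

def kt_coin_betting(gradients, initial_wealth=1.0):
    # One-dimensional parameter-free online linear optimization via
    # Krichevsky-Trofimov coin betting, assuming |g_t| <= 1.
    wealth = initial_wealth
    sum_outcomes = 0.0  # running sum of the "coin outcomes" c_t = -g_t
    predictions = []
    for t, g in enumerate(gradients, start=1):
        beta = sum_outcomes / t   # KT betting fraction, always in (-1, 1)
        x = beta * wealth         # bet a signed fraction of current wealth
        predictions.append(x)
        wealth -= g * x           # wealth update: W_t = W_{t-1} - g_t * x_t
        sum_outcomes += -g
    return predictions
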
Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations
TLDR
A novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries, and developing an algorithm that provides a regret bound of $O\big(U\sqrt{T \log(U\sqrt{T}\log^2 T + 1)}\big)$, where U is the L2 norm of an arbitrary comparator and both T and U are unknown to the player.
...
...