# Online Non-Convex Learning: Following the Perturbed Leader is Optimal

@article{Suggala2020OnlineNL, title={Online Non-Convex Learning: Following the Perturbed Leader is Optimal}, author={Arun Sai Suggala and Praneeth Netrapalli}, journal={ArXiv}, year={2020}, volume={abs/1903.08110} }

We study the problem of online learning with non-convex losses, where the learner has access to an offline optimization oracle. We show that the classical Follow the Perturbed Leader (FTPL) algorithm achieves the optimal regret rate of $O(T^{-1/2})$ in this setting. This improves upon the previous best-known regret rate of $O(T^{-1/3})$ for FTPL. We further show that an optimistic variant of FTPL achieves better regret bounds when the sequence of losses encountered by the learner is "predictable".
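To make the setting concrete, here is a minimal sketch of an FTPL loop driven by an offline oracle. It is an illustration under stated assumptions, not the authors' implementation: the oracle is a hypothetical exhaustive search over a finite candidate grid, the perturbation is drawn with i.i.d. exponential coordinates of rate `eta` (the parameterization is an assumption), and the names `grid_oracle`, `ftpl`, `regret`, and the optional `guesses` argument (a stand-in for the optimistic variant) are all made up for this example.

```python
import numpy as np

def grid_oracle(objective, candidates):
    """Hypothetical offline oracle: exact minimizer over a finite candidate set."""
    values = np.array([objective(x) for x in candidates])
    return candidates[int(np.argmin(values))]

def ftpl(losses, candidates, eta, rng, guesses=None):
    """Sketch of FTPL: each round, minimize (past losses - <sigma, x>).

    If `guesses` is given, guesses[t] is added to the objective in round t,
    a rough stand-in for an optimistic variant (assumption, not the paper's code).
    """
    d = candidates.shape[1]
    plays = []
    for t in range(len(losses)):
        # Fresh perturbation each round; rate eta means mean 1/eta per coordinate.
        sigma = rng.exponential(scale=1.0 / eta, size=d)
        past = losses[:t]
        guess = guesses[t] if guesses is not None else None

        def objective(x, past=past, sigma=sigma, guess=guess):
            total = sum(f(x) for f in past) - float(sigma @ x)
            if guess is not None:
                total += guess(x)
            return total

        plays.append(grid_oracle(objective, candidates))
    return np.array(plays)

def regret(losses, plays, candidates):
    """Cumulative loss of the plays minus that of the best fixed candidate."""
    incurred = sum(f(x) for f, x in zip(losses, plays))
    best = min(sum(f(x) for f in losses) for x in candidates)
    return incurred - best

# Toy usage: non-convex 1-D losses on a grid over [-1, 1].
rng = np.random.default_rng(0)
candidates = np.linspace(-1.0, 1.0, 41).reshape(-1, 1)
shifts = rng.uniform(0.0, 1.0, size=50)
losses = [lambda x, c=c: float(np.sin(3.0 * x[0] + c) ** 2) for c in shifts]
plays = ftpl(losses, candidates, eta=1.0, rng=rng)
print("average regret:", regret(losses, plays, candidates) / len(losses))
```

The grid search only stands in for whatever offline solver is available; the point of the oracle model is that the learner never needs convexity, only the ability to minimize a perturbed sum of past losses.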

## 26 Citations

### Non-convex Online Optimization With an Offline Oracle

- Computer Science, Mathematics
- 2019

In this project, we will look at the problem of online optimization in the non-convex setting, assuming that the player has access to an offline oracle. As we will see, it has recently been proven…

### Online non-convex optimization with imperfect feedback

- Computer Science
- NeurIPS
- 2020

This work derives a series of tight regret minimization guarantees, both for the learner's static (external) regret, as well as the regret incurred against the best dynamic policy in hindsight, from a general template based on a kernel-based estimator.

### Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

- Computer Science
- NeurIPS
- 2020

This work shows that when the sequence of loss functions is predictable, a simple modification of FTPL which incorporates optimism can achieve better regret guarantees, while retaining the optimal worst-case regret guarantee for unpredictable sequences.

### Non-Convex Follow the Perturbed Leader ECE 543 Final Project

- Computer Science
- 2019

It would seem that the convexity of the loss functions is not the crucial factor that makes online learning more difficult than offline statistical learning.

### Distributed Online Non-convex Optimization with Composite Regret

- Computer Science
- 2022 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
- 2022

This work proposes a novel composite regret with a new network-based metric to evaluate distributed online optimization algorithms and shows that DINOCO can achieve sublinear regret; to the authors' knowledge, this is the first regret bound for general distributed online non-convex learning.

### Regret minimization in stochastic non-convex learning via a proximal-gradient approach

- Computer Science
- ICML
- 2021

This work develops a prox-grad method based on stochastic first-order feedback, and a simpler method for when access to a perfect first-order oracle is possible, and establishes that both are min-max order-optimal.

### Online Convex Optimization with Unbounded Memory

- Computer Science
- ArXiv
- 2022

This work introduces a generalization of the OCO framework, “Online Convex Optimization with Unbounded Memory”, that captures long-term dependence on past decisions, and introduces the notion of $p$-effective memory capacity, $H_p$, that quantifies the maximum influence of past decisions on current losses.

### Regrets of proximal method of multipliers for online non-convex optimization with long term constraints

- Computer Science
- Journal of Global Optimization
- 2022

OPMM is proved to be an implementable projection method for solving the online non-convex optimization problem, and it is demonstrated that the regret of the objective reduction can be established even when the feasible set is non-convex.

### A Unifying Framework for Online Optimization with Long-Term Constraints

- Computer Science
- 2022

The algorithm is the first to provide guarantees in the adversarial setting with respect to the optimal strategy that satisfies the long-term constraints, and guarantees a $\rho/(1+\rho)$ fraction of the optimal reward and sublinear regret, where $\rho$ is a feasibility parameter related to the existence of strictly feasible solutions.

### Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

- Computer Science
- ArXiv
- 2022

New notions of bilevel regret are introduced, an online alternating time-averaged gradient method is developed that is capable of leveraging smoothness, and regret bounds are extended in terms of the path-length of the inner and outer minimizer sequences.

## References

### Online Learning with Non-Convex Losses and Non-Stationary Regret

- Computer Science
- AISTATS
- 2018

A sublinear regret bound is established for online learning with non-convex loss functions and a non-stationary regret measure, in the form of a cumulative regret bound of $O(\sqrt{T + V_T T})$, where $V_T$ is the total temporal variation of the loss functions.

### Perturbation Techniques in Online Learning and Optimization

- Computer Science
- 2016

It is shown that the classical algorithm known as Follow The Perturbed Leader (FTPL) can be viewed through the lens of stochastic smoothing, a tool that has proven popular within convex optimization.

### Optimization, Learning, and Games with Predictable Sequences

- Computer Science
- NIPS
- 2013

It is proved that a version of Optimistic Mirror Descent can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of O((log T)/T).

### Online Learning in Adversarial Lipschitz Environments

- Computer Science, Mathematics
- ECML/PKDD
- 2010

A class of algorithms is provided with cumulative regret upper bounded by $O(\sqrt{dT \ln \lambda})$, where $d$ is the dimension of the search space, $T$ the time horizon, and $\lambda$ the Lipschitz constant.

### Efficient Regret Minimization in Non-Convex Games

- Computer Science
- ICML
- 2017

A natural notion of regret is defined which permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum, and gradient-based methods that achieve optimal regret are given.

### Learning in Non-convex Games with an Optimization Oracle

- Computer Science
- COLT
- 2019

By slightly strengthening the oracle model, the online and the statistical learning models become computationally equivalent for any Lipschitz and bounded function.

### The Hedge Algorithm on a Continuum

- Mathematics, Computer Science
- ICML
- 2015

A generalized Hedge algorithm is proposed, and an $O(\sqrt{t \log t})$ bound on the regret is shown when the losses are uniformly Lipschitz and $S$ is uniformly fat (a weaker condition than convexity).

### Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization

- Computer Science
- AISTATS
- 2011

It is proved that many mirror descent algorithms for online convex optimization (such as online gradient descent) have an equivalent interpretation as follow-the-regularized-leader (FTRL) algorithms, and the FTRL-Proximal algorithm can be seen as a hybrid of these two algorithms, which significantly outperforms both on a large, real-world dataset.

### Adaptive Online Prediction by Following the Perturbed Leader

- Computer Science
- J. Mach. Learn. Res.
- 2005

This work derives loss bounds for Follow the Perturbed Leader with an adaptive learning rate, covering both finite expert classes with uniform weights and countable expert classes with arbitrary weights.

### Online Learning with Predictable Sequences

- Computer Science
- COLT
- 2013

This work presents methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences, and that compete with a set of possible predictable processes, learning the process concurrently with using it to obtain better regret guarantees.