• Corpus ID: 83458554

Online Non-Convex Learning: Following the Perturbed Leader is Optimal

Arun Sai Suggala, Praneeth Netrapalli
We study the problem of online learning with non-convex losses, where the learner has access to an offline optimization oracle. We show that the classical Follow the Perturbed Leader (FTPL) algorithm achieves the optimal regret rate of $O(T^{-1/2})$ in this setting. This improves upon the previous best-known regret rate of $O(T^{-1/3})$ for FTPL. We further show that an optimistic variant of FTPL achieves better regret bounds when the sequence of losses encountered by the learner is 'predictable'.
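The FTPL-with-oracle idea described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the 1-D grid domain, the brute-force `offline_oracle`, the linear perturbation with an exponentially distributed coefficient of mean `eta`, and the value chosen for `eta` are all hypothetical choices made for the example.

```python
import math
import random


def offline_oracle(objective, grid):
    """Hypothetical offline oracle: exact minimizer of `objective` over a finite grid."""
    return min(grid, key=objective)


def ftpl(loss_fns, grid, eta, rng):
    """Run FTPL over `grid` and return the list of points played."""
    seen = []   # losses revealed so far
    plays = []
    for f in loss_fns:
        # Assumption: a fresh exponential perturbation with mean eta each round.
        sigma = rng.expovariate(1.0 / eta)

        def perturbed(x, sigma=sigma):
            # Cumulative past loss minus a random linear term.
            return sum(g(x) for g in seen) - sigma * x

        plays.append(offline_oracle(perturbed, grid))
        seen.append(f)  # the loss is revealed only after the learner commits
    return plays


def average_regret(loss_fns, grid, plays):
    """Average regret against the best fixed grid point in hindsight."""
    incurred = sum(f(x) for f, x in zip(loss_fns, plays))
    best_fixed = min(sum(f(x) for f in loss_fns) for x in grid)
    return (incurred - best_fixed) / len(loss_fns)


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [i / 100 for i in range(101)]
    # Non-convex losses whose phase changes every round (illustrative only).
    losses = [lambda x, p=rng.uniform(0, math.pi): math.sin(8 * x + p)
              for _ in range(200)]
    plays = ftpl(losses, grid, eta=math.sqrt(200), rng=rng)
    print(average_regret(losses, grid, plays))
```

The learner never needs convexity here: all of the difficulty is delegated to the oracle call, and the random linear term is what keeps consecutive plays from being exploited by an adversary.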

Non-convex Online Optimization With an Offline Oracle

In this project, we will look at the problem of online optimization in the non-convex setting, assuming that the player has access to an offline oracle. As we will see, it has recently been proven

Online non-convex optimization with imperfect feedback

This work derives a series of tight regret minimization guarantees, both for the learner's static (external) regret, as well as the regret incurred against the best dynamic policy in hindsight, from a general template based on a kernel-based estimator.

Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

This work shows that when the sequence of loss functions is predictable, a simple modification of FTPL which incorporates optimism can achieve better regret guarantees, while retaining the optimal worst-case regret guarantee for unpredictable sequences.

Non-Convex Follow the Perturbed Leader ECE 543 Final Project

It would seem that the convexity of the loss functions is not the crucial factor that makes online learning more difficult than offline statistical learning.

Distributed Online Non-convex Optimization with Composite Regret

This work proposes a novel composite regret with a new network-based metric to evaluate distributed online optimization algorithms, and shows that DINOCO can achieve sublinear regret; to the authors' knowledge, this is the first regret bound for general distributed online non-convex learning.

Regret minimization in stochastic non-convex learning via a proximal-gradient approach

A prox-grad method based on stochastic first-order feedback, and a simpler method for when access to a perfect first-order oracle is possible, are developed, and both are established to be min-max order-optimal.

Online Convex Optimization with Unbounded Memory

This work introduces a generalization of the OCO framework, “Online Convex Optimization with Unbounded Memory”, that captures long-term dependence on past decisions, and introduces the notion of $p$-effective memory capacity, $H_p$, that quantifies the maximum influence of past decisions on current losses.

Regrets of proximal method of multipliers for online non-convex optimization with long term constraints

OPMM is proved to be an implementable projection method for solving the online non-convex optimization problem, and it is demonstrated that the regret of the objective reduction can be established even when the feasible set is non-convex.

A Unifying Framework for Online Optimization with Long-Term Constraints

The algorithm is the first to provide guarantees in the adversarial setting with respect to the optimal strategy that satisfies the long-term constraints, and guarantees a ρ/(1 + ρ) fraction of the optimal reward and sublinear regret, where ρ is a feasibility parameter related to the existence of strictly feasible solutions.

Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

New notions of bilevel regret are introduced, an online alternating time-averaged gradient method is developed that is capable of leveraging smoothness, and regret bounds are extended in terms of the path-length of the inner and outer minimizer sequences.



Online Learning with Non-Convex Losses and Non-Stationary Regret

A sublinear regret bound is established for online learning with non-convex loss functions and a non-stationary regret measure: the cumulative regret is $O(\sqrt{T + V_T T})$, where $V_T$ is the total temporal variation of the loss functions.

Perturbation Techniques in Online Learning and Optimization

It is shown that the classical algorithm known as Follow The Perturbed Leader (FTPL) can be viewed through the lens of stochastic smoothing, a tool that has proven popular within convex optimization.

Optimization, Learning, and Games with Predictable Sequences

It is proved that a version of Optimistic Mirror Descent can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of O((log T)/T).

Online Learning in Adversarial Lipschitz Environments

A class of algorithms is provided with cumulative regret upper bounded by $O(\sqrt{dT \ln(\lambda)})$, where $d$ is the dimension of the search space, $T$ the time horizon, and $\lambda$ the Lipschitz constant.

Efficient Regret Minimization in Non-Convex Games

A natural notion of regret is defined that permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum, and gradient-based methods that achieve optimal regret are given.

Learning in Non-convex Games with an Optimization Oracle

By slightly strengthening the oracle model, the online and the statistical learning models become computationally equivalent for any Lipschitz and bounded function.

The Hedge Algorithm on a Continuum

A generalized Hedge algorithm is proposed, and an $O(\sqrt{t \log t})$ bound on the regret is shown when the losses are uniformly Lipschitz and S is uniformly fat (a weaker condition than convexity).

Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization

It is proved that many mirror descent algorithms for online convex optimization (such as online gradient descent) have an equivalent interpretation as follow-the-regularized-leader (FTRL) algorithms, and the FTRL-Proximal algorithm can be seen as a hybrid of these two algorithms, which significantly outperforms both on a large, real-world dataset.

Adaptive Online Prediction by Following the Perturbed Leader

This work derives loss bounds for Follow the Perturbed Leader with an adaptive learning rate, covering both finite expert classes with uniform weights and countable expert classes with arbitrary weights.

Online Learning with Predictable Sequences

Methods are presented for online linear optimization that take advantage of benign (as opposed to worst-case) sequences, and that compete with a set of possible predictable processes, learning the process concurrently with using it to obtain better regret guarantees.