Corpus ID: 83458554

Online Non-Convex Learning: Following the Perturbed Leader is Optimal

Arun Sai Suggala, Praneeth Netrapalli
We study the problem of online learning with non-convex losses, where the learner has access to an offline optimization oracle. We show that the classical Follow the Perturbed Leader (FTPL) algorithm achieves the optimal regret rate of $O(T^{-1/2})$ in this setting, improving upon the previous best-known rate of $O(T^{-1/3})$ for FTPL. We further show that an optimistic variant of FTPL achieves better regret bounds when the sequence of losses encountered by the learner is predictable.
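The FTPL template the abstract describes can be sketched in a few lines: at each round, the learner hands the offline oracle the cumulative past loss plus a random linear perturbation and plays the returned minimizer. A minimal Python sketch follows; the grid-search "oracle", the one-dimensional decision set, and the parameter `eta` are illustrative assumptions, not the paper's construction in full generality.

```python
import numpy as np

def offline_oracle(objective, grid):
    """Offline optimization oracle: return the grid point minimizing the
    given objective. A grid search stands in for any non-convex solver."""
    values = [objective(x) for x in grid]
    return grid[int(np.argmin(values))]

def ftpl(losses, grid, eta, rng):
    """Follow the Perturbed Leader: each round, play the oracle's minimizer
    of the cumulative past loss minus a random linear perturbation."""
    history, plays = [], []
    for loss in losses:
        # Fresh exponential perturbation each round (as in the FTPL analysis);
        # the scale 1/eta is an illustrative choice.
        sigma = rng.exponential(scale=1.0 / eta)
        objective = lambda x, s=sigma: sum(l(x) for l in history) - s * x
        plays.append(offline_oracle(objective, grid))
        history.append(loss)
    return plays
```

Note that the learner only ever queries the oracle; no gradient or convexity structure of the losses is used.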

Non-convex Online Optimization With an Offline Oracle

In this project, we will look at the problem of online optimization in the non-convex setting, assuming that the player has access to an offline oracle. As we will see, it has recently been proven …

Online non-convex optimization with imperfect feedback

This work derives a series of tight regret minimization guarantees, both for the learner's static (external) regret, as well as the regret incurred against the best dynamic policy in hindsight, from a general template based on a kernel-based estimator.

Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

This work shows that when the sequence of loss functions is predictable, a simple modification of FTPL which incorporates optimism can achieve better regret guarantees, while retaining the optimal worst-case regret guarantee for unpredictable sequences.
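The optimistic modification described above amounts to handing the oracle one extra term: a guess of the next loss, added to the perturbed cumulative objective. A minimal sketch under the same illustrative assumptions as a grid-search oracle; using the previous round's loss as the guess is one simple choice of predictor, not the only one considered.

```python
import numpy as np

def optimistic_ftpl(losses, grid, eta, rng):
    """Optimistic FTPL sketch: the oracle minimizes past losses plus a guess
    of the next loss (here, simply the previous loss) minus a perturbation."""
    history, plays = [], []
    for loss in losses:
        sigma = rng.exponential(scale=1.0 / eta)
        # Predictor m_t: last observed loss, or zero on the first round.
        guess = history[-1] if history else (lambda x: 0.0)
        objective = lambda x, g=guess, s=sigma: (
            sum(l(x) for l in history) + g(x) - s * x
        )
        values = [objective(x) for x in grid]
        plays.append(grid[int(np.argmin(values))])
        history.append(loss)
    return plays
```

When the guesses are accurate (the sequence is predictable), the played point is already near the next loss's minimizer, which is the intuition behind the improved regret bounds.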

Non-Convex Follow the Perturbed Leader ECE 543 Final Project

It would seem that the convexity of the loss functions is not the crucial factor that makes online learning more difficult than offline statistical learning.

Distributed Online Non-convex Optimization with Composite Regret

A novel composite regret with a new network-based metric to evaluate distributed online optimization algorithms, and shows that DINOCO can achieve sublinear regret; to the authors' knowledge, this is the first regret bound for general distributed online non-convex learning.

Regret minimization in stochastic non-convex learning via a proximal-gradient approach

A proximal-gradient method based on stochastic first-order feedback, and a simpler method for when access to a perfect first-order oracle is possible, are developed; both are shown to be min-max order-optimal.

Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

New notions of bilevel regret are introduced, an online alternating time-averaged gradient method is developed that is capable of leveraging smoothness, and regret bounds are extended in terms of the path-length of the inner and outer minimizer sequences.

Online learning with dynamics: A minimax perspective

This work provides a unifying analysis that recovers regret bounds for several well studied problems including online learning with memory, online control of linear quadratic regulators, online Markov decision processes, and tracking adversarial targets.

Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging

We propose a hierarchical version of dual averaging for zeroth-order online non-convex optimization – i.e., learning processes where, at each stage, the optimizer is facing an unknown non-convex loss.

Differentially Private Objective Perturbation: Beyond Smoothness and Convexity

It is found that for the problem of learning linear classifiers, directly optimizing for 0/1 loss using the approach can out-perform the more standard approach of privately optimizing a convex-surrogate loss function on the Adult dataset.

Online Learning with Non-Convex Losses and Non-Stationary Regret

A sublinear regret bound is established for online learning with non-convex loss functions under a non-stationary regret measure: the cumulative regret is bounded by $O(\sqrt{T + V_T T})$, where $V_T$ is the total temporal variation of the loss functions.

Optimization, Learning, and Games with Predictable Sequences

It is proved that a version of Optimistic Mirror Descent can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of O((log T)/T).

Online Learning in Adversarial Lipschitz Environments

A class of algorithms is provided with cumulative regret upper bounded by $O(\sqrt{dT \ln \lambda})$, where $d$ is the dimension of the search space, $T$ the time horizon, and $\lambda$ the Lipschitz constant.

Efficient Regret Minimization in Non-Convex Games

A natural notion of regret is defined which permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum, and gradient-based methods that achieve optimal regret under this notion are given.

Learning in Non-convex Games with an Optimization Oracle

By slightly strengthening the oracle model, the online and the statistical learning models become computationally equivalent for any Lipschitz and bounded function.

The Hedge Algorithm on a Continuum

A generalized Hedge algorithm is proposed, and an $O(\sqrt{T \log T})$ regret bound is shown when the losses are uniformly Lipschitz and the decision set $S$ is uniformly fat (a weaker condition than convexity).
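The idea of running Hedge on a continuum can be illustrated by discretizing the decision set and running standard exponential weights on the grid; for uniformly Lipschitz losses, a fine enough grid approximates the continuum. A minimal sketch, assuming $S = [0, 1]$ and illustrative choices of grid size and learning rate:

```python
import numpy as np

def hedge_continuum(losses, n_grid=101, eta=0.5, rng=None):
    """Exponential weights (Hedge) over a uniform grid of S = [0, 1]."""
    if rng is None:
        rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, n_grid)
    log_w = np.zeros(n_grid)                 # log-weights for numerical stability
    plays = []
    for loss in losses:
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        plays.append(rng.choice(grid, p=p))  # sample a point from the weights
        log_w -= eta * loss(grid)            # multiplicative update on each grid point
    return plays
```

The discretization error enters the regret through the Lipschitz constant, which is why conditions on the losses (and the fatness of $S$) appear in the bound.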

Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization

It is proved that many mirror descent algorithms for online convex optimization (such as online gradient descent) have an equivalent interpretation as follow-the-regularized-leader (FTRL) algorithms, and the FTRL-Proximal algorithm can be seen as a hybrid of these two algorithms, which significantly outperforms both on a large, real-world dataset.

Adaptive Online Prediction by Following the Perturbed Leader

This work derives loss bounds for Follow the Perturbed Leader with an adaptive learning rate, for both finite expert classes with uniform weights and countable expert classes with arbitrary weights.

Online Learning with Predictable Sequences

Methods for online linear optimization are presented that take advantage of benign (as opposed to worst-case) sequences, competing with a set of possible predictable processes while simultaneously using them to obtain better regret guarantees.

Efficient algorithms for online decision problems