# Regret minimization in stochastic non-convex learning via a proximal-gradient approach

@inproceedings{Hallak2021RegretMI, title={Regret minimization in stochastic non-convex learning via a proximal-gradient approach}, author={Nadav Hallak and P. Mertikopoulos and V. Cevher}, booktitle={ICML}, year={2021} }

Motivated by applications in machine learning and operations research, we study regret minimization with stochastic first-order oracle feedback in online constrained, and possibly non-smooth, non-convex problems. In this setting, the minimization of external regret is beyond reach for first-order methods, so we focus on a local regret measure defined via a proximal-gradient mapping. To achieve no (local) regret in this setting, we develop a prox-grad method based on stochastic first-order…
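For intuition, the prox-gradient mapping that defines the local regret measure can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's algorithm or setting: an ℓ1 non-smooth term with proximal operator `soft_threshold`, a quadratic smooth loss with Gaussian gradient noise, and a fixed step size `gamma`.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau*||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_grad_mapping(x, grad, gamma, lam):
    # G_gamma(x) = (x - prox_{gamma*lam*||.||_1}(x - gamma*grad)) / gamma
    return (x - soft_threshold(x - gamma * grad, gamma * lam)) / gamma

rng = np.random.default_rng(0)
x = np.ones(5)                                  # illustrative starting point
gamma, lam, sigma, T = 0.1, 0.05, 0.1, 200
local_regret = 0.0
for t in range(T):
    # Stochastic first-order oracle for f(x) = 0.5*||x||^2: true gradient plus noise.
    g = x + sigma * rng.standard_normal(x.shape)
    G = prox_grad_mapping(x, g, gamma, lam)
    local_regret += np.linalg.norm(G) ** 2      # local regret accumulates ||G_gamma(x_t)||^2
    x = x - gamma * G                           # same as the prox-grad step x <- prox(x - gamma*g)
print(local_regret / T)
```

Note that the update `x - gamma * G` is algebraically identical to the proximal-gradient step, so the quantity being accumulated is exactly the squared norm of the mapping along the iterate trajectory.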

#### 5 Citations

Online non-convex optimization with imperfect feedback

- Computer Science, Mathematics
- NeurIPS
- 2020

This work derives a series of tight regret minimization guarantees, both for the learner's static (external) regret and for the regret incurred against the best dynamic policy in hindsight, from a general template based on a kernel-based estimator.

Convergence of the Inexact Online Gradient and Proximal-Gradient Under the Polyak-Łojasiewicz Condition

- Computer Science, Mathematics
- ArXiv
- 2021

Convergence results show that the instantaneous regret converges linearly up to an error that depends on the variability of the problem and the statistics of the gradient error; in particular, bounds in expectation and in high probability are provided by leveraging a sub-Weibull model for the errors affecting the gradient.
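The cited behavior — linear decay of the gap up to an error floor set by the gradient-error statistics — is easy to reproduce on a toy problem. The quadratic objective, step size, and error level below are illustrative assumptions, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(3)
# f(x) = 0.5*||x||^2 satisfies the Polyak-Lojasiewicz inequality with mu = 1.
x = 10.0 * np.ones(4)
eta, err = 0.2, 0.01
gaps = []
for t in range(100):
    g = x + err * rng.standard_normal(x.shape)  # inexact gradient: true gradient plus error
    x = x - eta * g
    gaps.append(0.5 * np.dot(x, x))             # optimality gap f(x_t) - f*
# The gap contracts geometrically (factor 1 - eta*mu per step) down to a floor
# whose size depends on the gradient-error level `err`.
print(gaps[0], gaps[-1])
```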

A Stochastic Operator Framework for Inexact Static and Online Optimization

- Computer Science, Mathematics
- ArXiv
- 2021

A unified stochastic operator framework is provided to analyze the convergence of iterative optimization algorithms for both static problems and online optimization and learning; results on convergence in mean and in high probability are presented.

Are we Forgetting about Compositional Optimisers in Bayesian Optimisation?

- Computer Science, Mathematics
- ArXiv
- 2020

This paper highlights the empirical advantages of the compositional approach to acquisition function maximisation across 3958 individual experiments comprising synthetic optimisation tasks as well as tasks from the 2020 NeurIPS competition on Black-Box Optimisation for Machine Learning.

OpReg-Boost: Learning to Accelerate Online Algorithms with Operator Regression

- Computer Science, Mathematics
- ArXiv
- 2021

This paper shows how to formalize the operator regression problem and proposes a computationally-efficient Peaceman-Rachford solver that exploits a closed-form solution of simple quadratically-constrained quadratic programs (QCQPs).

#### References

Showing 1–10 of 42 references.

On the Regret Minimization of Nonconvex Online Gradient Ascent for Online PCA

- Computer Science, Mathematics
- COLT
- 2019

An adversarially-perturbed spiked-covariance model is introduced in which each data point is assumed to follow a fixed stochastic distribution with a non-zero spectral gap in the covariance matrix, but is then perturbed with some adversarial vector.

Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging

- Computer Science, Mathematics
- ICML
- 2021

We propose a hierarchical version of dual averaging for zeroth-order online non-convex optimization – i.e., learning processes where, at each stage, the optimizer is facing an unknown non-convex loss…

Logarithmic regret algorithms for online convex optimization

- Mathematics, Computer Science
- Machine Learning
- 2007

Several algorithms achieving logarithmic regret are proposed which, besides being more general, are also much more efficient to implement, and give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field.
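For a concrete feel of logarithmic regret, the simplest route is online gradient descent with step sizes eta_t = 1/(alpha*t) on alpha-strongly convex losses; the quadratic tracking losses and parameters below are illustrative assumptions, not the Newton-based method described above:

```python
import numpy as np

# Online gradient descent on alpha-strongly convex losses
# loss_t(x) = 0.5*alpha*||x - z_t||^2 with step sizes eta_t = 1/(alpha*t);
# regret vs. the best fixed action grows only like log T.
rng = np.random.default_rng(4)
alpha, T = 1.0, 2000
x = np.zeros(2)
targets = rng.uniform(-1.0, 1.0, size=(T, 2))
losses = []
for t in range(1, T + 1):
    z = targets[t - 1]
    losses.append(0.5 * alpha * np.sum((x - z) ** 2))
    x = x - (1.0 / (alpha * t)) * alpha * (x - z)   # gradient of loss_t at x
best = targets.mean(axis=0)   # best fixed action in hindsight for these quadratics
regret = sum(losses) - sum(0.5 * alpha * np.sum((best - z) ** 2) for z in targets)
print(regret)
```

With these quadratics the update reduces to a running average of the targets, and the printed regret stays tiny compared to T.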

Online Optimization with Gradual Variations

- Mathematics, Computer Science
- COLT
- 2012

It is shown that for linear and general smooth convex loss functions, an online algorithm modified from the gradient descent algorithm can achieve a regret which only scales as the square root of the deviation; as an application, a logarithmic regret bound is also obtained for the portfolio management problem.

Non-Stationary Stochastic Optimization

- Computer Science, Mathematics
- Oper. Res.
- 2015

Tight bounds on the minimax regret allow us to quantify the "price of non-stationarity," which mathematically captures the added complexity embedded in a temporally changing environment versus a stationary one.

Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback

- Mathematics, Computer Science
- COLT
- 2010

The multi-point bandit setting, in which the player can query each loss function at multiple points, is introduced, and regret bounds that closely resemble bounds for the full information case are proved.
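A minimal sketch of the two-point instance of this idea: querying the loss at x ± delta*u for a random unit direction u yields a nearly unbiased gradient estimate from function values alone. The test function and constants below are illustrative assumptions:

```python
import numpy as np

def two_point_grad(f, x, delta, rng):
    # Two-point gradient estimate: sample a random unit direction u and
    # combine the two queries f(x + delta*u), f(x - delta*u).
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return (len(x) / (2 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

rng = np.random.default_rng(1)
f = lambda x: np.dot(x, x)          # true gradient: 2x
x = np.array([1.0, -2.0, 0.5])
# Averaging many independent estimates concentrates around the true gradient.
est = np.mean([two_point_grad(f, x, 1e-3, rng) for _ in range(20000)], axis=0)
print(est, 2 * x)
```

For this quadratic the estimator is exactly unbiased, so the averaged estimate matches 2x up to Monte Carlo noise.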

Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods

- Computer Science, Mathematics
- NeurIPS
- 2019

This paper proposes a multi-step gradient descent-ascent algorithm that finds an $\varepsilon$-first-order stationary point of the game in $\widetilde{O}(\varepsilon^{-3.5})$ iterations, which is the best known rate in the literature.
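The multi-step structure (several ascent steps on the inner player per descent step on the outer player) can be sketched on a toy quadratic game; the payoff function and step sizes below are illustrative assumptions, not the paper's problem class:

```python
# Toy min-max game: min_x max_y f(x, y) = 0.5*x**2 + 2*x*y - y**2.
# The inner problem in y is strongly concave with maximizer y*(x) = x,
# so the outer objective is g(x) = 1.5*x**2, minimized at x = 0.

def multi_step_gda(x0, y0, eta_x, eta_y, K, T):
    x, y = x0, y0
    for _ in range(T):
        for _ in range(K):                  # K ascent steps on the inner player y
            y = y + eta_y * (2 * x - 2 * y)
        x = x - eta_x * (x + 2 * y)         # one descent step on the outer player x
    return x, y

x, y = multi_step_gda(2.0, -1.0, eta_x=0.05, eta_y=0.2, K=10, T=500)
print(x, y)   # both coordinates approach the stationary point (0, 0)
```

The inner loop lets y track y*(x) closely before each outer step, which is what makes the descent direction for x reliable.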

Kernel-based methods for bandit convex optimization

- Mathematics, Computer Science
- STOC
- 2017

We consider the adversarial convex bandit problem and we build the first poly(T)-time algorithm with poly(n) √T-regret for this problem. To do so we introduce three new ideas in the derivative-free…

Lower Bounds for Non-Convex Stochastic Optimization

- Mathematics, Computer Science
- ArXiv
- 2019

It is proved that (in the worst case) any algorithm requires at least $\epsilon^{-4}$ queries to find an $\epsilon$-stationary point, which establishes that stochastic gradient descent is minimax optimal in this model.

Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2009

A new online algorithm is developed, the regularized dual averaging (RDA) method, that can explicitly exploit the regularization structure in an online setting and can be very effective for sparse online learning with $\ell_1$-regularization.
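The $\ell_1$-regularized RDA update has a closed form: soft-threshold the running average of the subgradients, then rescale. The sketch below assumes the common step-size choice beta_t = gamma*sqrt(t) and an entirely illustrative gradient stream:

```python
import numpy as np

def rda_l1(grads_stream, lam, gamma):
    # l1-RDA: x_{t+1} = -(sqrt(t)/gamma) * soft_threshold(gbar_t, lam),
    # where gbar_t is the running average of the observed subgradients.
    gbar = None
    for t, g in enumerate(grads_stream, start=1):
        gbar = g if gbar is None else gbar + (g - gbar) / t
        shrunk = np.sign(gbar) * np.maximum(np.abs(gbar) - lam, 0.0)
        yield -(np.sqrt(t) / gamma) * shrunk

rng = np.random.default_rng(2)
# Average gradient (-1, 0.01, 0) plus noise; lam = 0.1 should zero the small coordinates.
grads = (np.array([-1.0, 0.01, 0.0]) + 0.05 * rng.standard_normal(3)
         for _ in range(1000))
for x in rda_l1(grads, lam=0.1, gamma=5.0):
    pass
print(x)   # only the first coordinate is active; the others are exactly zero
```

Because the thresholding acts on the averaged gradient rather than on the iterate, coordinates whose average gradient stays below lam are set to exactly zero, which is the sparsity effect the blurb refers to.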