# SAN: Stochastic Average Newton Algorithm for Minimizing Finite Sums

```bibtex
@inproceedings{Chen2022SANSA,
  title     = {SAN: Stochastic Average Newton Algorithm for Minimizing Finite Sums},
  author    = {Jiabin Chen and Rui Yuan and Guillaume Garrigos and Robert Mansel Gower},
  booktitle = {AISTATS},
  year      = {2022}
}
```

We present a principled approach for designing stochastic Newton methods for solving finite sum optimization problems. Our approach has two steps. First, we rewrite the stationarity conditions as a system of nonlinear equations that associates each data point with a new row. Second, we apply a subsampled Newton-Raphson method to solve this system of nonlinear equations. By design, methods developed using our approach are incremental, in that they require only a single data point per iteration…
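To make the two-step recipe concrete, here is a minimal sketch (not the authors' SAN update itself): the stationarity condition of a ridge-regularized logistic-regression finite sum is written as the nonlinear system ∇f(w) = 0 and solved with full-batch Newton-Raphson. SAN instead subsamples this system, touching only a single data point per iteration. All problem sizes and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])           # illustrative ground truth
y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
lam = 0.1                                      # ridge term makes f strongly convex

# Step 1: stationarity as a nonlinear system F(w) = grad f(w) = 0
def grad(w):
    p = 1 / (1 + np.exp(-X @ w))               # sigmoid predictions
    return X.T @ (p - y) / n + lam * w

def hess(w):
    p = 1 / (1 + np.exp(-X @ w))
    D = p * (1 - p)                            # per-sample curvature weights
    return (X.T * D) @ X / n + lam * np.eye(d)

# Step 2: Newton-Raphson on F (full-batch here; SAN subsamples the rows)
w = np.zeros(d)
for _ in range(10):
    w -= np.linalg.solve(hess(w), grad(w))

print(np.linalg.norm(grad(w)))                 # stationarity residual
```

Full-batch Newton converges here in a handful of iterations; the point of SAN is to retain Newton-type behavior while replacing the full system with one sampled row per step.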

## One Citation

### SP2: A Second Order Stochastic Polyak Method

- Computer Science, ArXiv
- 2022

This work develops a method for solving the interpolation equations that uses a local second-order approximation of the model, and uses Hessian-vector products to speed up the convergence of SP.

## References


### A Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums

- Mathematics, Computer Science, ICML
- 2016

A new incremental method with a superlinear convergence rate, the Newton-type incremental method (NIM): the idea is to introduce a model of the objective with the same sum-of-functions structure and to update a single component of the model per iteration.

### Sketched Newton-Raphson

- Computer Science, SIAM Journal on Optimization
- 2022

By showing that SNR can be interpreted as a variant of the stochastic gradient descent (SGD) method, the authors leverage proof techniques of SGD to establish a global convergence theory and rates of convergence for SNR.

### Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates

- Mathematics, Computer Science, ArXiv
- 2019

This work presents two new remarkably simple stochastic second-order methods for minimizing the average of a very large number of sufficiently smooth and strongly convex functions and establishes local linear-quadratic convergence results.

### Exact and Inexact Subsampled Newton Methods for Optimization

- Computer Science
- 2016

This paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact).
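A hedged sketch of that scheme on ridge-regularized least squares: the gradient is exact, the Hessian is subsampled, and each Newton system is solved only approximately with a few hand-rolled conjugate-gradient iterations driven by Hessian-vector products. Sample sizes and iteration counts are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
lam = 1.0

# f(w) = 1/(2n) ||Aw - b||^2 + lam/2 ||w||^2 -- exact (full) gradient
def grad(w):
    return A.T @ (A @ w - b) / n + lam * w

def cg(hvp, g, iters=5):
    """A few conjugate-gradient steps: approximately solve H x = g."""
    x = np.zeros_like(g)
    r = g.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

w = np.zeros(d)
for _ in range(20):
    S = rng.choice(n, size=50, replace=False)      # Hessian subsample only
    hvp = lambda v: A[S].T @ (A[S] @ v) / len(S) + lam * v
    w -= cg(hvp, grad(w))                          # inexact Newton step

print(np.linalg.norm(grad(w)))                     # residual of the full gradient
```

Keeping the gradient exact is what preserves the correct fixed point: only the curvature estimate is noisy, so the iteration still converges to the true minimizer.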

### Greedy Quasi-Newton Methods with Explicit Superlinear Convergence

- Mathematics, SIAM J. Optim.
- 2021

This paper establishes an explicit non-asymptotic bound on the local superlinear convergence rate of greedy quasi-Newton methods, which contains a contraction factor depending on the square of the iteration counter, and shows that these methods produce Hessian approximations whose deviation from the exact Hessian converges linearly to zero.

### Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence

- Computer Science, Mathematics, SIAM J. Optim.
- 2017

A randomized second-order method known as the Newton Sketch is proposed, based on performing an approximate Newton step using a randomly projected or sub-sampled Hessian. It achieves super-linear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities.
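A minimal sketch of the Newton Sketch idea on unregularized least squares, where the Hessian square root (the data matrix itself) is compressed with a dense Gaussian sketch while the gradient is kept exact. The paper also covers structured sketches (e.g. randomized Hadamard) and general self-concordant objectives; the dimensions and step count below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 2000, 20, 800                 # m = sketch size (m < n)
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# f(w) = 1/(2n) ||Aw - b||^2, so the Hessian is A^T A / n
w = np.zeros(d)
for _ in range(20):
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sketch, E[S^T S] = I
    SA = S @ A                                     # sketched Hessian square root
    H_sk = SA.T @ SA / n                           # approximates A^T A / n
    g = A.T @ (A @ w - b) / n                      # exact gradient
    w -= np.linalg.solve(H_sk, g)                  # approximate Newton step

print(np.linalg.norm(A.T @ (A @ w - b) / n))       # gradient norm at the result
```

Because the sketched Hessian is a spectral approximation of the true one, each step contracts the error by a factor governed by the sketch accuracy, so a fresh sketch per iteration still drives the gradient to zero on this quadratic.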

### IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate

- Mathematics, SIAM J. Optim.
- 2018

IQN is the first stochastic quasi-Newton method proven to converge superlinearly in a local neighborhood of the optimal solution.

### Sub-sampled Newton methods

- Computer Science, Mathematics, Math. Program.
- 2019

For large-scale finite-sum minimization problems, we study non-asymptotic and high-probability global as well as local convergence properties of variants of Newton’s method where the Hessian and/or…

### SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization

- Mathematics, ICML
- 2016

Unlike existing methods such as stochastic dual coordinate ascent, SDNA is capable of utilizing all local curvature information contained in the examples, which leads to striking improvements in both theory and practice.

### Convergence rates of sub-sampled Newton methods

- Computer Science, NIPS
- 2015

This paper uses sub-sampling techniques together with low-rank approximation to design a new randomized batch algorithm which possesses comparable convergence rate to Newton's method, yet has much smaller per-iteration cost.