# Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive Sample Size Approach

@article{Jin2021ExploitingLC, title={Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive Sample Size Approach}, author={Qiujiang Jin and Aryan Mokhtari}, journal={ArXiv}, year={2021}, volume={abs/2106.05445} }

In this paper, we study the application of quasi-Newton methods for solving empirical risk minimization (ERM) problems defined over a large dataset. Traditional deterministic and stochastic quasi-Newton methods can be executed to solve such problems; however, it is known that their global convergence rate may not be better than first-order methods, and their local superlinear convergence only appears towards the end of the learning process. In this paper, we use an adaptive sample size scheme…
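The general idea of combining an adaptive sample size schedule with a quasi-Newton update can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the sample-doubling rule, the regularized logistic loss, the Armijo backtracking line search, the number of BFGS steps per stage, and all function names (`erm_loss_grad`, `bfgs_steps`, `adaptive_sample_size_bfgs`) are hypothetical choices made for this example.

```python
import numpy as np

def erm_loss_grad(w, X, y, reg):
    """Regularized logistic loss and its gradient on samples (X, y), with y in {-1, +1}."""
    z = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -z)) + 0.5 * reg * (w @ w)
    sig = 0.5 * (1.0 - np.tanh(0.5 * z))  # numerically stable 1 / (1 + exp(z))
    grad = -(X.T @ (y * sig)) / len(y) + reg * w
    return loss, grad

def bfgs_steps(w, H, X, y, reg, n_steps):
    """Run n_steps BFGS iterations on the given subsample; H approximates the inverse Hessian."""
    loss, g = erm_loss_grad(w, X, y, reg)
    for _ in range(n_steps):
        d = -H @ g
        t = 1.0
        while True:  # Armijo backtracking (an assumption of this sketch)
            w_new = w + t * d
            loss_new, g_new = erm_loss_grad(w_new, X, y, reg)
            if loss_new <= loss + 1e-4 * t * (g @ d) or t < 1e-10:
                break
            t *= 0.5
        s, yk = w_new - w, g_new - g
        sy = s @ yk
        if sy > 1e-12:  # skip the update if the curvature condition fails
            rho = 1.0 / sy
            V = np.eye(len(w)) - rho * np.outer(s, yk)
            H = V @ H @ V.T + rho * np.outer(s, s)  # standard inverse-BFGS update
        w, loss, g = w_new, loss_new, g_new
    return w, H

def adaptive_sample_size_bfgs(X, y, reg=1e-2, m0=32, steps_per_stage=3):
    """Solve ERM on growing subsets, warm-starting both w and H at each stage."""
    n, dim = X.shape
    w, H = np.zeros(dim), np.eye(dim)
    m = min(m0, n)
    while True:
        w, H = bfgs_steps(w, H, X[:m], y[:m], reg, steps_per_stage)
        if m == n:
            break
        m = min(2 * m, n)  # assumed schedule: double the sample size each stage
    return w
```

Each stage reuses the previous iterate and inverse-Hessian approximation as a warm start, so later stages on larger subsets begin closer to their own minimizer. Calling `adaptive_sample_size_bfgs(X, y)` with a feature matrix `X` of shape `(n, dim)` and labels `y` in `{-1, +1}` returns the final weight vector.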

## References

First-Order Adaptive Sample Size Methods to Reduce Complexity of Empirical Risk Minimization

- Computer Science, Mathematics
- NIPS
- 2017

Theoretical analyses show that the use of adaptive sample size methods reduces the overall computational cost of achieving the statistical accuracy of the whole dataset for a broad range of deterministic and stochastic first-order methods.

Convergence rates of sub-sampled Newton methods

- Computer Science, Mathematics
- NIPS
- 2015

This paper uses sub-sampling techniques together with low-rank approximation to design a new randomized batch algorithm whose convergence rate is comparable to that of Newton's method, yet whose per-iteration cost is much smaller.

A Stochastic Quasi-Newton Method for Large-Scale Optimization

- Mathematics, Computer Science
- SIAM J. Optim.
- 2016

A stochastic quasi-Newton method that is efficient, robust and scalable, and employs the classical BFGS update formula in its limited memory form, based on the observation that it is beneficial to collect curvature information pointwise, and at regular intervals, through (sub-sampled) Hessian-vector products.

Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy

- Computer Science, Mathematics
- NIPS
- 2016

It is shown theoretically and empirically that Ada Newton can double the size of the training set in each iteration to achieve the statistical accuracy of the full training set with about two passes over the dataset.

IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate

- Mathematics, Computer Science
- SIAM J. Optim.
- 2018

IQN is the first stochastic quasi-Newton method proven to converge superlinearly in a local neighborhood of the optimal solution.

Non-asymptotic Superlinear Convergence of Standard Quasi-Newton Methods

- Mathematics, Computer Science
- ArXiv
- 2020

This paper provides a finite time (non-asymptotic) convergence analysis for BFGS and DFP methods under the assumptions that the objective function is strongly convex, its gradient is Lipschitz continuous, and its Hessians are continuous only in the direction of the optimal solution.

A globally convergent incremental Newton method

- Mathematics, Computer Science
- Math. Program.
- 2015

It is shown that the incremental Newton method for minimizing the sum of a large number of strongly convex functions is globally convergent for a variable stepsize rule and that, under a gradient growth condition, the convergence rate is linear for both variable and constant stepsize rules.

RES: Regularized Stochastic BFGS Algorithm

- Mathematics, Computer Science
- IEEE Transactions on Signal Processing
- 2014

Convergence results show that lower and upper bounds on the Hessian eigenvalues of the sample functions are sufficient to guarantee almost sure convergence of a subsequence generated by RES and convergence of the sequence in expectation to optimal arguments.

Competing with the Empirical Risk Minimizer in a Single Pass

- Computer Science, Mathematics
- COLT
- 2015

This work provides a simple streaming algorithm which, under standard regularity assumptions on the underlying problem, enjoys, among other properties, the following: the algorithm can be implemented in linear time with a single pass of the observed data, using space linear in the size of a single sample.

Global Convergence of a Class of Quasi-Newton Methods on Convex Problems

- Mathematics
- 1987

We study the global convergence properties of the restricted Broyden class of quasi-Newton methods, when applied to a convex objective function. We assume that the line search satisfies a standard…