Corpus ID: 162184306

Acceleration of SVRG and Katyusha X by Inexact Preconditioning

@inproceedings{Liu2019AccelerationOS,
  title={Acceleration of SVRG and Katyusha X by Inexact Preconditioning},
  author={Yanli Liu and Fei Feng and Wotao Yin},
  booktitle={ICML},
  year={2019}
}
Empirical risk minimization is an important class of optimization problems with many popular machine learning applications, and stochastic variance reduction methods are popular choices for solving them. Among these methods, SVRG and Katyusha X (a Nesterov-accelerated SVRG) achieve fast convergence without substantial memory requirements. In this paper, we propose to accelerate these two algorithms by \textit{inexact preconditioning}: the proposed methods employ \textit{fixed} preconditioners…
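To make the preconditioning idea concrete, below is a minimal sketch of SVRG with a fixed diagonal preconditioner, written independently of the paper; with a diagonal metric the scaled step has a closed form, whereas the paper's inexact preconditioning solves the preconditioned subproblem only approximately for more general fixed preconditioners. The names grad_i, M_diag, and step are illustrative, not the authors' implementation.

import numpy as np

def preconditioned_svrg(grad_i, n, x0, M_diag, step, n_epochs=20, m=None, rng=None):
    # SVRG outer loop: recompute the full gradient at a snapshot point, then
    # run m inner steps with the variance-reduced estimator, scaling each
    # step by the fixed diagonal preconditioner M_diag (a positive vector).
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m                      # inner-loop length
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_epochs):
        x_snap = x.copy()                          # snapshot
        full_grad = np.mean([grad_i(x_snap, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            g = grad_i(x, i) - grad_i(x_snap, i) + full_grad   # variance-reduced gradient
            x -= step * g / M_diag                 # preconditioned step: x - step * M^{-1} g
    return x

Roughly speaking, for a non-diagonal fixed preconditioner the scaled step becomes a small quadratic subproblem in each inner iteration, and the point of the paper is that solving this subproblem inexactly, with a fixed small number of inner-solver iterations, is enough to obtain acceleration of SVRG and Katyusha X.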
Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
TLDR
SVRN, a Stochastic Variance-Reduced Newton algorithm, is proposed; it enjoys all the benefits of second-order methods: a simple unit step size, easily parallelizable large-batch operations, and fast local convergence, while at the same time taking advantage of variance reduction to achieve improved convergence rates for smooth and strongly convex problems.
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
TLDR
This paper revisits and improves the convergence of policy gradient, natural PG (NPG) methods, and their variance-reduced variants under general smooth policy parametrizations, and proposes SRVR-NPG, which incorporates variance reduction into the NPG update.

References

SHOWING 1-10 OF 34 REFERENCES
An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration
TLDR
An inexact variable-metric proximal point algorithm is proposed to accelerate gradient-based optimization algorithms; it is compatible with composite objectives, meaning that it can provide exactly sparse solutions when the objective involves a sparsity-inducing regularization.
Breaking the Span Assumption Yields Fast Finite-Sum Minimization
In this paper, we show that SVRG and SARAH can be modified to be fundamentally faster than all of the other standard algorithms that minimize the sum of $n$ smooth functions, such as SAGA, SAG, SDCA,…
Katyusha: the first direct acceleration of stochastic gradient methods
TLDR
Katyusha momentum is introduced, a novel "negative momentum" on top of Nesterov's momentum that can be incorporated into a variance-reduction-based algorithm to speed it up, and in each such case, one could potentially give Katyusha a hug.
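Schematically, and glossing over the proximal/mirror steps and the exact parameter choices in the paper, the Katyusha iteration couples three sequences, with the $\tau_2 \tilde{x}$ term acting as the "negative momentum" that retracts the iterate toward the snapshot $\tilde{x}$:

    x_{k+1} = \tau_1 z_k + \tau_2 \tilde{x} + (1 - \tau_1 - \tau_2)\, y_k,
    \tilde{\nabla}_{k+1} = \nabla f(\tilde{x}) + \nabla f_i(x_{k+1}) - \nabla f_i(\tilde{x}),
    y_{k+1} = x_{k+1} - \tfrac{1}{3L}\, \tilde{\nabla}_{k+1},
    z_{k+1} = z_k - \alpha\, \tilde{\nabla}_{k+1},

with the index i sampled uniformly at random and, in the original analysis, \tau_2 = 1/2.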
Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling
TLDR
This paper improves the best known running time of accelerated coordinate descent by a factor up to $n$, based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter.
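In symbols, the sampling rule described above picks coordinate $i$ with probability proportional to the square root of its coordinate-wise smoothness constant $L_i$:

    p_i = \frac{\sqrt{L_i}}{\sum_{j=1}^{n} \sqrt{L_j}}.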
Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives
TLDR
It is shown that SVRG, although originally designed for strongly convex objectives, is also very robust in non-strongly-convex or sum-of-non-convex settings.
IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate
TLDR
IQN is the first stochastic quasi-Newton method proven to converge superlinearly in a local neighborhood of the optimal solution, and its local superlinear convergence rate is established.
Stochastic dual coordinate ascent methods for regularized loss
TLDR
A new analysis of Stochastic Dual Coordinate Ascent (SDCA) is presented, showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD.
A Stochastic Quasi-Newton Method for Large-Scale Optimization
TLDR
A stochastic quasi-Newton method is proposed that is efficient, robust, and scalable; it employs the classical BFGS update formula in its limited-memory form and is based on the observation that it is beneficial to collect curvature information pointwise, and at regular intervals, through (sub-sampled) Hessian-vector products.
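As a rough illustration of the curvature-collection idea, the sketch below forms an (s, y) pair from two consecutive averaged iterates; it approximates the Hessian-vector product by a finite difference of gradients on a Hessian subsample, whereas the method described above uses exact (sub-sampled) Hessian-vector products. The names grad_S, w_bar_prev, and w_bar are illustrative only.

def curvature_pair(grad_S, w_bar_prev, w_bar, eps=1e-6):
    # grad_S: callable returning the gradient on the Hessian subsample S
    # w_bar_prev, w_bar: numpy arrays holding consecutive averaged iterates
    s = w_bar - w_bar_prev                      # displacement between averaged iterates
    # y ~ H_S(w_bar) @ s, approximated by a finite difference of sub-sampled
    # gradients (a stand-in for an exact sub-sampled Hessian-vector product)
    y = (grad_S(w_bar + eps * s) - grad_S(w_bar)) / eps
    return s, y

Such (s, y) pairs, collected only every few iterations rather than at every step, then feed a standard limited-memory BFGS two-loop recursion to produce the search direction.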
Stochastic proximal quasi-Newton methods for non-convex composite optimization
TLDR
This paper proposes a generic algorithmic framework for stochastic proximal quasi-Newton (SPQN) methods to solve non-convex composite optimization problems, together with a modified self-scaling symmetric rank-one update incorporated into this framework, called the stochastic symmetric rank-one method.
Stochastic Block BFGS: Squeezing More Curvature out of Data
TLDR
Numerical tests on large-scale logistic regression problems reveal that the proposed novel limited-memory stochastic block BFGS update is more robust and substantially outperforms current state-of-the-art methods.