# A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization

```bibtex
@article{Bi2021AVC,
  title={A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization},
  author={Jia Bi and Steve R. Gunn},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.09893}
}
```
• Published 19 February 2021
• Computer Science
• ArXiv
This paper proposes a novel optimization method, Variance Controlled Stochastic Gradient (VCSG), to improve the performance of the stochastic variance reduced gradient (SVRG) algorithm. To avoid over-reducing the variance of the gradient by SVRG, VCSG introduces a hyper-parameter λ that controls the amount of variance reduction. Theory shows that the optimization method can converge using an unbiased gradient estimator, but in practice biased gradient estimation can allow more…
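The abstract's idea can be sketched as follows: in SVRG, the stochastic gradient is corrected by a snapshot term, and the λ in VCSG scales how strongly that correction is applied. The exact placement of λ in the VCSG update is an assumption based on the abstract; `vcsg_sketch` and its parameters are illustrative names, not the authors' implementation. With λ = 1 this reduces to plain SVRG, and with λ = 0 to vanilla SGD.

```python
import numpy as np

def vcsg_sketch(grad_i, full_grad, x0, n, lam=0.5, lr=0.05, epochs=3, m=50, seed=0):
    """Sketch of a lambda-controlled SVRG-style loop (assumed form of VCSG).

    grad_i(x, i):  gradient of the i-th component function at x
    full_grad(x):  full-batch gradient at x
    lam in [0, 1]: scales the variance-reduction correction
                   (lam=1 -> plain SVRG, lam=0 -> plain SGD)
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        mu = full_grad(snapshot)  # full gradient at the snapshot point
        for _ in range(m):
            i = rng.integers(n)
            # lambda controls how much of the SVRG correction is applied,
            # trading variance reduction against bias in the estimator
            v = grad_i(x, i) - lam * (grad_i(snapshot, i) - mu)
            x -= lr * v
    return x

# Usage on a least-squares problem f(x) = (1/2n) * ||Ax - b||^2:
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)
x = vcsg_sketch(
    grad_i=lambda x, i: A[i] * (A[i] @ x - b[i]),
    full_grad=lambda x: A.T @ (A @ x - b) / 20,
    x0=np.zeros(3), n=20, lam=0.5,
)
```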
2 Citations
• Computer Science
ECCV
• 2022
This work proposes an adaptive algorithm that accurately estimates drift across clients in Federated Learning and induces stability by constraining the norm of the client-drift estimates, making it more practical for large-scale FL.
• Computer Science
BMVC
• 2021
The proposed GhostShiftAddNet can achieve higher classification accuracy with fewer FLOPs and parameters (reduced by up to 3×) than GhostNet, and inference latency on the Jetson Nano is improved by 1.3× on the GPU and 2× on the CPU, respectively.

## References

Showing 1–10 of 43 references

• Jie Chen
• Computer Science
ArXiv
• 2018
This work shows, in a general setting, that consistent gradient estimators result in the same convergence behavior as do unbiased ones, and opens several new research directions, including the development of more efficient SGD updates with consistent estimators and the design of efficient training algorithms for large-scale graphs.
• Computer Science
NIPS
• 2013
It is proved that this method enjoys the same fast convergence rate as stochastic dual coordinate ascent (SDCA) and stochastic average gradient (SAG), but the analysis is significantly simpler and more intuitive.
• Computer Science
ICML
• 2016
This work proves non-asymptotic rates of convergence of SVRG for nonconvex optimization, and shows that it is provably faster than SGD and gradient descent.
• Computer Science
NeurIPS
• 2018
This work proposes a new stochastic gradient descent algorithm based on nested variance reduction that improves upon the best known gradient complexities of SVRG and SCSG.
• Computer Science
Math. Program.
• 2016
The AG method is generalized to solve nonconvex and possibly stochastic optimization problems, and it is demonstrated that, by properly specifying the stepsize policy, the AG method exhibits the best known rate of convergence for solving general nonconvex smooth optimization problems using first-order information, similarly to the gradient descent method.
• Computer Science, Mathematics
SIAM J. Optim.
• 2009
It is intended to demonstrate that a properly modified SA approach can be competitive and even significantly outperform the SAA method for a certain class of convex stochastic problems.
• Computer Science, Mathematics
ArXiv
• 2019
It is proved that (in the worst case) any algorithm requires at least $\epsilon^{-4}$ queries to find an $\epsilon$-stationary point, which establishes that stochastic gradient descent is minimax optimal in this model.
• Computer Science, Mathematics
NeurIPS
• 2018
This paper proposes a new technique named SPIDER, which can be used to track many deterministic quantities of interest with significantly reduced computational cost and proves that SPIDER-SFO nearly matches the algorithmic lower bound for finding approximate first-order stationary points under the gradient Lipschitz assumption in the finite-sum setting.
This work embeds both LMS and steepest descent, as well as other intermediate methods, within a one-parameter class of algorithms, and proposes a hybrid class of methods that combine the faster early convergence rate of LMS with the faster ultimate linear convergence rate of steepest descent.
• Computer Science
2016 IEEE 55th Conference on Decision and Control (CDC)
• 2016
This paper analyzes the SAGA algorithm within an Incremental First-order Oracle framework, and shows that it converges to a stationary point provably faster than both gradient descent and stochastic gradient descent.