Corpus ID: 229340501

Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework

@article{Sharma2020ZerothOrderHG,
  title={Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework},
  author={Pranay Sharma and Kaidi Xu and Sijia Liu and Pin-Yu Chen and Xue Lin and Pramod K. Varshney},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.11518}
}
In this work, we focus on the study of stochastic zeroth-order (ZO) optimization, which does not require first-order gradient information and uses only function evaluations. The problem of ZO optimization has emerged in many recent machine learning applications, where the gradient of the objective function is either unavailable or difficult to compute. In such cases, we can approximate the full gradients or stochastic gradients through function-value-based gradient estimates. Here, we propose a… 
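As background for the function-value-based gradient estimates mentioned in the abstract, here is a minimal sketch (not taken from the paper) of a standard two-point random-direction ZO gradient estimator. The function name zo_gradient_estimate, the smoothing radius mu, the number of directions q, and the toy quadratic are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, q=10, rng=None):
    """Two-point random-direction zeroth-order gradient estimate.

    Approximates grad f(x) from function values only, by averaging
    d/(2*mu) * [f(x + mu*u) - f(x - mu*u)] * u over q random unit
    directions u. mu and q are illustrative defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    g = np.zeros(d)
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                    # random unit direction
        g += (f(x + mu * u) - f(x - mu * u)) * u  # two function queries per direction
    return (d / (2.0 * mu * q)) * g

# Toy usage: plain ZO gradient descent on a "black-box" quadratic.
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(5)
for _ in range(200):
    x -= 0.1 * zo_gradient_estimate(f, x)
print(x)  # approaches the all-ones minimizer, up to estimation noise
```

Each iteration costs 2q function queries and no gradient evaluations, which is the query-based access model the abstract describes.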

Citations

Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order Coordinate Method
TLDR: Proposes ZODIAC, a zeroth-order (ZO) distributed primal-dual coordinate method that solves the stochastic problem of minimizing a global cost function formed by the sum of n local cost functions, using only ZO information exchange.
Accelerated Zeroth-order Algorithm for Stochastic Distributed Nonconvex Optimization
TLDR: Proposes a zeroth-order (ZO) distributed primal-dual stochastic coordinate algorithm equipped with the “powerball” method for acceleration, and proves a convergence rate of O(√p/√(nT)) for general nonconvex cost functions.
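A side note on the “powerball” acceleration named in the entry above: assuming the standard Powerball transform sign(g)·|g|^γ applied element-wise to the gradient estimate (our reading of that line of work, not code from the cited paper), a minimal centralized sketch could look like the following; the names powerball, gamma, the step size, and the toy objective are all illustrative.

```python
import numpy as np

def powerball(g, gamma=0.6):
    """Element-wise Powerball transform: sign(g) * |g|**gamma, with 0 < gamma < 1."""
    return np.sign(g) * np.abs(g) ** gamma

def two_point_grad(f, x, mu=1e-3, rng=None):
    """Single-direction two-point ZO gradient estimate (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.size)
    u /= np.linalg.norm(u)
    return x.size * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# Toy centralized loop: ZO gradient descent with the Powerball transform.
f = lambda x: np.sum((x - 2.0) ** 2)
x = np.zeros(4)
for _ in range(500):
    x -= 0.05 * powerball(two_point_grad(f, x))
print(x)  # drifts toward the all-twos minimizer (noisily: one random direction per step)
```

The distributed primal-dual coordinate machinery and the O(√p/√(nT)) analysis of the cited paper are not reproduced here.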

References

Showing 1-10 of 41 references.
Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
TLDR: Proposes two accelerated versions of ZO-SVRG that use variance-reduced gradient estimators, achieving the best known rate for ZO stochastic optimization (in terms of iterations) while striking a balance between convergence rate and function-query complexity.
Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks
TLDR: Presents a principled optimization framework integrating a zeroth-order (ZO) gradient estimator with an alternating projected stochastic gradient descent-ascent method, where the former requires only a small number of function queries and the latter needs just a one-step descent/ascent update.
Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization
TLDR: Develops a new algorithm that is free from Gaussian random-variable generation and allows a large constant stepsize while maintaining the same convergence rate and query complexity, and shows that ZO-SPIDER-Coord automatically achieves a linear convergence rate once the iterate enters a local PL region, without restarts or algorithmic modification.
Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations
TLDR: Focusing on nonasymptotic convergence-rate bounds, shows that when pairs of function values are available, algorithms for d-dimensional optimization that use gradient estimates based on random perturbations suffer at most a factor of √d in convergence rate over traditional stochastic gradient methods.
ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization
TLDR: Proposes a zeroth-order AdaMM (ZO-AdaMM) algorithm that generalizes AdaMM to the gradient-free regime, and shows empirically that ZO-AdaMM converges much faster to a high-accuracy solution than state-of-the-art ZO optimization methods.
On the Information-Adaptive Variants of the ADMM: An Iteration Complexity Perspective
TLDR: Presents a suite of ADMM variants in which the trade-offs between the information required about the objective and the computational complexity are made explicit, making the method applicable to a much broader class of problems where only noisy estimates of the gradient or the function values are accessible.
Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
TLDR: Discusses a variant of the algorithm that applies a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and shows that this modification significantly improves the large-deviation properties of the algorithm.
A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning: Principals, Recent Advances, and Applications
TLDR: Provides a comprehensive review of zeroth-order optimization, with an emphasis on the underlying intuition, optimization principles, and recent advances in convergence analysis.
Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates
In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for nonconvex and convex optimization. Specifically, we propose generalizations of the conditional gradient… 
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
TLDR: Proposes SPIDER, a new technique for tracking many deterministic quantities of interest with significantly reduced computational cost, and proves that SPIDER-SFO nearly matches the algorithmic lower bound for finding approximate first-order stationary points under the gradient Lipschitz assumption in the finite-sum setting.
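Several of the entries above (ZO-SVRG, ZO-SPIDER-Coord, SPIDER) rest on variance-reduced gradient estimators. The sketch below is an illustration under assumed notation, not any of those papers' code: it shows a SPIDER-style recursive tracker v_k = g(x_k) - g(x_{k-1}) + v_{k-1}, with the stochastic gradient g replaced by a two-point ZO estimate; zo_spider_step, f_batch, and mu are hypothetical names and defaults.

```python
import numpy as np

def two_point_estimate(f, x, mu, u):
    """Two-point ZO estimate of grad f(x) along a fixed unit direction u."""
    return x.size * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

def zo_spider_step(f_batch, x_prev, x_curr, v_prev, mu=1e-3, rng=None):
    """One SPIDER-style recursive update of a ZO gradient tracker:

        v_k = g(x_k) - g(x_{k-1}) + v_{k-1},

    where g is a two-point ZO estimate of the mini-batch gradient, evaluated
    with the same random direction at x_k and x_{k-1} so the two estimates
    are correlated and their difference has low variance. f_batch stands for
    the objective restricted to the current mini-batch; names are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x_curr.size)
    u /= np.linalg.norm(u)
    return (two_point_estimate(f_batch, x_curr, mu, u)
            - two_point_estimate(f_batch, x_prev, mu, u)
            + v_prev)
```

SPIDER-type methods also periodically reset the tracker with a full (or large-batch) estimate at the start of each epoch; that reset, the mini-batch sampling, and the step-size schedule are omitted here for brevity.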