Corpus ID: 199000989

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

@article{Huang2019NonconvexZS,
  title={Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity},
  author={Feihu Huang and Shangqian Gao and Jian Pei and Heng Huang},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.13463}
}
Zeroth-order (gradient-free) methods are a class of powerful optimization tools for many machine learning problems, because they need only function values (not gradients) in the optimization. In particular, zeroth-order methods are well suited to many complex problems, such as black-box attacks and bandit feedback, whose explicit gradients are difficult or infeasible to obtain. Although many zeroth-order methods have been developed recently, these approaches still suffer from two main drawbacks: 1) high… 
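As a rough illustration of the function-query-only access these methods assume, the sketch below estimates a gradient with a two-point Gaussian-smoothing scheme; the function name zo_gradient, the smoothing radius mu, and the quadratic test function are illustrative choices, not the specific estimator used in the paper.

import numpy as np

def zo_gradient(f, x, mu=1e-4, num_samples=10, rng=None):
    """Estimate grad f(x) using only function evaluations (no gradients)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)               # random search direction
        grad += (f(x + mu * u) - f(x)) / mu * u  # forward-difference estimate
    return grad / num_samples

# Example: estimate the gradient of a simple quadratic; the true gradient is x itself.
f = lambda x: 0.5 * np.sum(x ** 2)
print(zo_gradient(f, np.ones(5)))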

Citations of this paper

Accelerated Zeroth-Order Momentum Methods from Mini to Minimax Optimization

An accelerated momentum descent ascent (Acc-MDA) method is presented for solving white-box minimax problems, and it is proved that it achieves the best known gradient complexity of $\tilde{O}(\kappa_y^3\epsilon^{-3})$ without large batches.
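For intuition only, one momentum descent-ascent step for min_x max_y f(x, y) can be sketched as below; the heavy-ball-style momentum buffers, stepsizes, and function names are illustrative and much simpler than the Acc-MDA update analyzed in the paper.

import numpy as np

def momentum_gda_step(grad_x, grad_y, x, y, vx, vy, lr_x=0.01, lr_y=0.05, beta=0.9):
    """One momentum descent-ascent step: descend in x, ascend in y,
    each direction smoothed by an exponential moving average of gradients."""
    vx = beta * vx + (1 - beta) * grad_x(x, y)
    vy = beta * vy + (1 - beta) * grad_y(x, y)
    return x - lr_x * vx, y + lr_y * vy, vx, vy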

SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms

This paper proposes SpiderBoost as an improved scheme, which allows the use of a much larger constant-level stepsize while maintaining the same near-optimal oracle complexity, and which can be extended with a proximal mapping to handle composite optimization (which is nonsmooth and nonconvex) with a provable convergence guarantee.
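A minimal sketch of the SPIDER-style recursive gradient estimator that SpiderBoost builds on is given below, assuming a finite-sum objective accessed through a per-sample gradient oracle grad_i(x, i); the epoch length q, mini-batch size, and constant stepsize are placeholder values rather than the tuned constants from the paper.

import numpy as np

def spider_boost(grad_i, x0, n, lr=0.1, q=10, batch=5, iters=100, rng=None):
    """SPIDER-style recursion: recompute the full gradient every q steps,
    and in between update the estimator v with a small mini-batch correction."""
    rng = np.random.default_rng() if rng is None else rng
    x_prev, x = x0.copy(), x0.copy()
    v = np.zeros_like(x0)
    for t in range(iters):
        if t % q == 0:
            v = np.mean([grad_i(x, i) for i in range(n)], axis=0)   # full gradient
        else:
            idx = rng.integers(0, n, size=batch)
            v = v + np.mean([grad_i(x, i) - grad_i(x_prev, i) for i in idx], axis=0)
        x_prev, x = x, x - lr * v                                   # constant stepsize
    return x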

Accelerated Stochastic Gradient-free and Projection-free Methods

An accelerated stochastic zeroth-order Frank-Wolfe (Acc-SZOFW) method is proposed, based on the variance-reduced technique of SPIDER/SpiderBoost and a novel momentum acceleration technique, and it still reaches a function query complexity of $O(d\epsilon^{-3})$ in the stochastic problem without relying on any large batches.
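The sketch below shows a single zeroth-order Frank-Wolfe step over an l1 ball, combining a two-point gradient estimate with the linear minimization oracle and the classic 2/(t+2) stepsize; it illustrates the gradient-free plus projection-free idea only, not the accelerated Acc-SZOFW scheme itself.

import numpy as np

def zo_frank_wolfe_step(f, x, t, radius=1.0, mu=1e-4, rng=None):
    """One gradient-free, projection-free step over {s : ||s||_1 <= radius}."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    u = rng.standard_normal(d)
    g = (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u   # two-point gradient estimate
    k = np.argmax(np.abs(g))
    s = np.zeros(d)
    s[k] = -radius * np.sign(g[k])                       # linear minimization oracle
    gamma = 2.0 / (t + 2)                                # classic Frank-Wolfe stepsize
    return x + gamma * (s - x)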

Faster Stochastic Quasi-Newton Methods

A novel faster stochastic quasi-Newton method (SpiderSQN) based on the variance-reduced technique of SPIDER is proposed, and it is proved that this method reaches the best known SFO complexity, matching the existing best result.

Discrete Model Compression With Resource Constraint for Deep Neural Networks

An efficient discrete optimization method is proposed to directly optimize channel-wise differentiable discrete gates under a resource constraint while freezing all the other model parameters; the approach is globally discrimination-aware due to the discrete setting.

Desirable Companion for Vertical Federated Learning: New Zeroth-Order Gradient Based Algorithm

This paper reveals that zeroth-order optimization (ZOO) is a desirable companion for VFL and proposes a novel and practical VFL framework with black-box models, which is inseparably tied to the promising properties of ZOO.

AsySQN: Faster Vertical Federated Learning Algorithms with Better Computation Resource Utilization

An asynchronous stochastic quasi-Newton (AsySQN) framework for VFL is proposed, under which three algorithms take descent steps scaled by approximate Hessian information; they converge much faster than SGD-based methods in practice and thus can dramatically reduce the number of communication rounds.

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

This work reexamines the effectiveness of RARL under a fundamental robust control setting, the linear quadratic (LQ) case, and proposes several other policy-based RARL algorithms whose convergence behaviors are studied both empirically and theoretically.

Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

It is proved that the Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d)$, where $d$ denotes the parameter dimension.

References

Showing 1-10 of 41 references

Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization

A class of fast zeroth-order stochastic ADMM methods for solving nonconvex problems with multiple nonsmooth penalties is proposed, based on the coordinate smoothing gradient estimator; these methods not only reach the best convergence rate for nonconvex optimization, but are also able to effectively solve many complex machine learning problems with multiple regularized penalties and constraints.
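For intuition, a coordinate-wise smoothing gradient estimator of the kind referred to here can be sketched as follows, with one central finite difference per coordinate (2d function queries for a d-dimensional point); the helper name and smoothing parameter mu are illustrative.

import numpy as np

def coord_zo_gradient(f, x, mu=1e-4):
    """Coordinate-wise smoothing estimator of grad f(x) from function values only."""
    d = x.shape[0]
    grad = np.zeros(d)
    for j in range(d):
        e = np.zeros(d)
        e[j] = 1.0
        grad[j] = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)  # central difference
    return grad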

Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization

A class of faster zeroth-order proximal stochastic methods with the variance reduction techniques of SVRG and SAGA is proposed, denoted ZO-ProxSVRG and ZO-ProxSAGA, respectively.
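As a toy example of the proximal step that such gradient-free methods combine with a zeroth-order gradient estimate, the sketch below uses the l1 proximal map (soft-thresholding); the function names, stepsize, and regularization weight are hypothetical and do not reproduce the ZO-ProxSVRG/ZO-ProxSAGA updates themselves.

import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def zo_prox_step(f, x, lr=0.1, lam=0.01, mu=1e-4, rng=None):
    """One proximal zeroth-order step: gradient-free estimate of the smooth part,
    followed by the proximal map of the nonsmooth l1 penalty."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape[0])
    g = (f(x + mu * u) - f(x)) / mu * u
    return soft_threshold(x - lr * g, lr * lam)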

Stochastic Zeroth-order Optimization via Variance Reduction method

This paper introduces a novel Stochastic Zeroth-order method with Variance Reduction under Gaussian smoothing (SZVR-G) and establishes the complexity for optimizing non-convex problems and successfully applies the method to conduct a universal black-box attack to deep neural networks.

Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization

A new algorithm is developed that is free from Gaussian variable generation and allows a large constant stepsize while maintaining the same convergence rate and query complexity, and it is shown that ZO-SPIDER-Coord automatically achieves a linear convergence rate as the iterate enters a local PL region, without restart or algorithmic modification.

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

Two accelerated versions of ZO-SVRG utilizing variance reduced gradient estimators are proposed, which achieve the best rate known for ZO stochastic optimization (in terms of iterations) and strike a balance between the convergence rate and the function query complexity.
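A minimal sketch of an SVRG-style variance-reduced zeroth-order estimator is shown below, assuming per-sample function access f_i(x, i) and a snapshot point x_snap with a precomputed full-data estimate g_snap; it conveys the control-variate idea rather than the exact ZO-SVRG estimators analyzed in the paper.

import numpy as np

def zo_svrg_estimator(f_i, x, x_snap, g_snap, idx, mu=1e-4, rng=None):
    """Variance-reduced ZO gradient: stochastic estimate at x, corrected by the
    estimate of the same samples at the snapshot plus the full snapshot gradient."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape[0])
    g_x = np.mean([(f_i(x + mu * u, i) - f_i(x, i)) / mu for i in idx]) * u
    g_s = np.mean([(f_i(x_snap + mu * u, i) - f_i(x_snap, i)) / mu for i in idx]) * u
    return g_x - g_s + g_snap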

Faster Stochastic Alternating Direction Method of Multipliers for Nonconvex Optimization

The theoretical analysis shows that the online SPIDER-ADMM has an IFO complexity of $\mathcal{O}(\epsilon^{-\frac{3}{2}})$, which improves the existing best results by a factor of $n$; the experimental results on benchmark datasets validate that the proposed algorithms converge faster than existing ADMM algorithms for nonconvex optimization.

Stochastic Alternating Direction Method of Multipliers

This paper establishes the convergence rate of ADMM for convex problems in terms of both the objective value and the feasibility violation, and proposes a stochastic ADMM algorithm for optimization problems with non-smooth composite objective functions.
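To make the update structure concrete, here is a sketch of one linearized stochastic ADMM step for the toy splitting min_x f(x) + lam*||z||_1 subject to x = z; the penalty parameter rho, proximal stepsize eta, and the choice to linearize f around the current iterate follow one common variant and are not the specific algorithm proposed in the paper.

import numpy as np

def stochastic_admm_step(grad_f_stoch, x, z, y, rho=1.0, eta=0.1, lam=0.01):
    """One linearized stochastic ADMM step for min_x f(x) + lam*||z||_1, s.t. x = z,
    where grad_f_stoch(x) returns an unbiased stochastic gradient of the smooth loss f."""
    g = grad_f_stoch(x)
    # x-update: minimize the linearized augmented Lagrangian plus a proximal term.
    x_new = (x / eta + rho * z - y - g) / (1.0 / eta + rho)
    # z-update: proximal map of the l1 penalty (soft-thresholding).
    v = x_new + y / rho
    z_new = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
    # dual update enforcing the constraint x = z.
    y_new = y + rho * (x_new - z_new)
    return x_new, z_new, y_new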

Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization

A randomized stochastic projected gradient (RSPG) algorithm is proposed, in which a proper mini-batch of samples is taken at each iteration depending on the total budget of stochastic samples allowed, and nearly optimal complexity of the algorithm is shown for convex stochastic programming.
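A bare-bones mini-batch stochastic projected gradient step in the spirit of RSPG might look as follows, with a Euclidean ball standing in for the feasible set; the mini-batch of stochastic gradients is assumed to be supplied by the caller, and the ball radius is an illustrative constraint.

import numpy as np

def rspg_step(stoch_grads, x, lr=0.1, radius=1.0):
    """Average a mini-batch of stochastic gradients, descend, then project
    back onto a Euclidean ball of the given radius."""
    g = np.mean(stoch_grads, axis=0)     # mini-batch gradient estimate
    x_new = x - lr * g
    norm = np.linalg.norm(x_new)
    if norm > radius:                    # Euclidean projection onto the ball
        x_new *= radius / norm
    return x_new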

Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming

This paper discusses a variant of the algorithm that applies a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and shows that such a modification significantly improves the large-deviation properties of the algorithm.

SpiderBoost: A Class of Faster Variance-reduced Algorithms for Nonconvex Optimization

SpiderBoost is proposed as an improved scheme that allows a much larger stepsize without sacrificing the convergence rate, and hence runs substantially faster in practice, and it extends much more easily to proximal algorithms with guaranteed convergence for solving composite optimization problems.