Stochastic Second-Order Method for Large-Scale Nonconvex Sparse Learning Models

Hongchang Gao and Heng Huang

Sparse learning models have shown promising performance in high-dimensional machine learning applications. [...] The proposed method incorporates second-order information to improve the convergence speed. Theoretical analysis shows that our proposed method enjoys a linear convergence rate and is guaranteed to converge to the underlying true model parameter. Experimental results verify the efficiency and correctness of our proposed method.


Loopless Semi-Stochastic Gradient Descent with Less Hard Thresholding for Sparse Learning

This work proposes an efficient single-layer semi-stochastic gradient hard thresholding (LSSG-HT) method and proves that the algorithm can converge to an optimal solution with a linear convergence rate.
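
Several of the methods listed here rely on a hard thresholding step. As a minimal sketch (not taken from any of the listed papers), the operator keeps the k largest-magnitude entries of a vector and zeroes the rest. The function name `hard_threshold` and the tie-breaking rule (lower index wins) are assumptions made for illustration.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the others to zero.

    Illustrative helper: ties in magnitude are broken in favour of the
    lower index, which is an arbitrary choice made for this sketch.
    """
    if k <= 0:
        return [0.0] * len(x)
    # Rank indices by descending magnitude, then by ascending index.
    ranked = sorted(range(len(x)), key=lambda i: (-abs(x[i]), i))
    keep = set(ranked[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


if __name__ == "__main__":
    print(hard_threshold([0.5, -3.0, 1.0, 2.0], 2))
```

Sorting costs O(d log d) per call for a d-dimensional vector, which is one reason some of the summaries above emphasize reducing the number of hard thresholding operations.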

Efficient Relaxed Gradient Support Pursuit for Sparsity Constrained Non-convex Optimization

This work proposes a new general relaxed gradient support pursuit (RGraSP) framework, in which the sub-algorithm is only required to satisfy a slack descent condition, and two specific semi-stochastic gradient hard thresholding algorithms, which are superior to state-of-the-art gradient hard thresholding methods.

Stochastic Recursive Gradient Support Pursuit and Its Sparse Representation Applications

This work proposes a new stochastic recursive gradient support pursuit (SRGSP) algorithm in which only one hard thresholding operation is required in each outer iteration, giving it a significantly lower computational complexity than existing methods such as SG-HT and SVRGHT.

Learning Deep Sparse Regularizers with Applications to Multi-View Clustering and Semi-Supervised Classification

A deep sparse regularizer learning model that learns data-driven sparse regularizers adaptively and applies this model to multi-view clustering and semi-supervised classification tasks to learn a latent compact representation.

Faster Stochastic Quasi-Newton Methods

A novel faster stochastic QN method (SpiderSQN) based on the variance reduction technique of SPIDER is proposed, and it is proved that this method reaches the best known SFO complexity, which also matches the existing best result.



Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction

A stochastic variance reduced optimization algorithm for solving sparse learning problems with cardinality constraints that enjoys strong linear convergence guarantees and optimal estimation accuracy in high dimensions is proposed.
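
To make the idea of variance-reduced stochastic optimization under a cardinality constraint concrete, here is a rough sketch for sparse least squares. It combines a standard SVRG-style gradient estimate with hard thresholding after each inner step. It is an illustration under assumptions, not the algorithm from the paper: the function names, the least-squares loss, the step size, and the loop counts are all hypothetical choices.

```python
import random


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x (ties: lower index wins)."""
    ranked = sorted(range(len(x)), key=lambda i: (-abs(x[i]), i))
    keep = set(ranked[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def grad_i(A, b, x, i):
    """Gradient of the i-th loss 0.5 * (a_i . x - b_i)^2 with respect to x."""
    r = sum(a * v for a, v in zip(A[i], x)) - b[i]
    return [r * a for a in A[i]]


def full_grad(A, b, x):
    """Average of the per-sample gradients."""
    n = len(A)
    g = [0.0] * len(x)
    for i in range(n):
        g = [u + v / n for u, v in zip(g, grad_i(A, b, x, i))]
    return g


def svrg_ht(A, b, k, step=0.1, outer=30, inner=20, seed=0):
    """SVRG-style updates followed by hard thresholding (illustrative only)."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x_ref = [0.0] * d
    for _ in range(outer):
        mu = full_grad(A, b, x_ref)  # full gradient at the reference point
        x = list(x_ref)
        for _ in range(inner):
            i = rng.randrange(n)
            g_new = grad_i(A, b, x, i)
            g_old = grad_i(A, b, x_ref, i)
            # Variance-reduced estimate: g_i(x) - g_i(x_ref) + full gradient.
            v = [gn - go + m for gn, go, m in zip(g_new, g_old, mu)]
            x = hard_threshold([xi - step * vi for xi, vi in zip(x, v)], k)
        x_ref = x
    return x_ref


if __name__ == "__main__":
    # Toy problem: identity design, 2-sparse target [2, 0, -1, 0].
    A = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    b = [2.0, 0.0, -1.0, 0.0]
    print(svrg_ht(A, b, k=2, step=0.2, outer=200, inner=10))
```

On this toy problem the iterate stays supported on the two true coordinates and approaches the target. Real problems need a step size chosen from the data's curvature, and it is common to run with a sparsity level k somewhat larger than the true one.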

Gradient Hard Thresholding Pursuit for Sparsity-Constrained Optimization

This paper generalizes HTP from compressive sensing to a generic problem setup of sparsity-constrained convex optimization and proves that the proposed algorithm enjoys the strong guarantees analogous to HTP in terms of rate of convergence and parameter estimation accuracy.

Linear Convergence of Stochastic Iterative Greedy Algorithms With Sparse Constraints

This generalized framework is specialized to the problems of sparse signal recovery in compressed sensing and low-rank matrix recovery, giving methods with provable convergence guarantees that often outperform their deterministic counterparts.

Newton Greedy Pursuit: A Quadratic Approximation Method for Sparsity-Constrained Optimization

The Newton Greedy Pursuit (NTGP) method, which approximately minimizes a twice-differentiable function under a sparsity constraint, is proposed, and the superiority of NTGP over several representative first-order greedy selection methods is demonstrated in synthetic and real sparse logistic regression tasks.

A Stochastic Quasi-Newton Method for Large-Scale Optimization

A stochastic quasi-Newton method that is efficient, robust and scalable, and employs the classical BFGS update formula in its limited memory form, based on the observation that it is beneficial to collect curvature information pointwise, and at regular intervals, through (sub-sampled) Hessian-vector products.
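
The limited-memory form of the BFGS update mentioned above is usually applied through the classical two-loop recursion, which multiplies a vector by the implicit inverse-Hessian approximation built from stored curvature pairs. The sketch below shows only that standard recursion. How the pairs are collected (for example through sub-sampled Hessian-vector products) is left out, and the function name `lbfgs_direction` and the list-of-pairs layout are assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lbfgs_direction(grad, pairs):
    """Apply the L-BFGS inverse-Hessian approximation to grad.

    pairs is a list of (s, y) curvature pairs, oldest first, with
    s = change in iterate and y = change in gradient (or a Hessian-vector
    product H s). Every pair is assumed to satisfy dot(s, y) > 0.
    """
    q = list(grad)
    alphas = []
    # First loop: newest pair to oldest.
    for s, y in reversed(pairs):
        alpha = dot(s, q) / dot(y, s)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append(alpha)
    # Initial scaling gamma * I from the most recent pair.
    if pairs:
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest.
    for (s, y), alpha in zip(pairs, reversed(alphas)):
        beta = dot(y, r) / dot(y, s)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r


if __name__ == "__main__":
    pairs = [([1.0, 0.0], [2.0, 0.0]), ([0.0, 1.0], [1.0, 3.0])]
    # Secant condition: applying the approximation to the newest y
    # returns the newest s.
    print(lbfgs_direction([1.0, 3.0], pairs))
```

A descent step would then be the current iterate minus a step size times the returned vector. With an empty list of pairs the function returns the gradient unchanged, so the step reduces to plain gradient descent.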

Adaptive Forward-Backward Greedy Algorithm for Learning Sparse Representations

Tong Zhang, IEEE Transactions on Information Theory, 2011

This work proposes a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever beneficial, and develops strong theoretical results for the new procedure showing that it can effectively solve the problem of learning a sparse target function.

Accelerated Stochastic Block Coordinate Gradient Descent for Sparsity Constrained Nonconvex Optimization

An accelerated stochastic block coordinate descent algorithm for nonconvex optimization under sparsity constraint in the high dimensional regime is proposed that converges to the unknown true parameter at a linear rate.

Efficient L1 Regularized Logistic Regression

Theoretical results show that the proposed efficient algorithm for L1 regularized logistic regression is guaranteed to converge to the global optimum, and experiments show that it significantly outperforms standard algorithms for solving convex optimization problems.

A Linearly-Convergent Stochastic L-BFGS Algorithm

It is demonstrated experimentally that the proposed new stochastic L-BFGS algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision.

Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data

This work considers the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse, and presents two new algorithms for solving problems with at least a thousand nodes in the Gaussian case.