An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification

  • R. Bao, Bin Gu, Heng Huang
  • Published 11 August 2022
  • Computer Science
  • Proceedings of the 31st ACM International Conference on Information & Knowledge Management
Sparsity regularized loss minimization problems play an important role in various fields, including machine learning, data mining, and modern statistics. The proximal gradient descent method and the coordinate descent method are the most popular approaches to solving such minimization problems. Although existing methods can achieve implicit model identification, aka support set identification, in a finite number of iterations, these methods still suffer from huge computational costs and memory burdens in… 
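A minimal sketch of the proximal gradient method the abstract refers to, on a Lasso objective 0.5‖Ax−b‖² + λ‖x‖₁ (the problem instance, function names, and parameters here are illustrative, not taken from the paper). The soft-thresholding proximal step drives off-support coordinates exactly to zero after finitely many iterations, which is the support (model) identification behavior the abstract discusses:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrinks each coordinate toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(A, b, lam, step, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 via proximal gradient (ISTA)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                          # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)   # proximal (shrinkage) step
    return x

# Small noiseless synthetic example: the true support {1, 4} is recovered,
# and all other coordinates are exactly zero, not just small.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[[1, 4]] = [2.0, -3.0]
b = A @ x_true
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = squared spectral norm of A
x_hat = proximal_gradient_lasso(A, b, lam=0.5, step=step)
support = set(np.flatnonzero(np.abs(x_hat) > 1e-8))
```

Note that every coordinate whose shrunken update falls below the threshold is set to exactly zero, so once the iterates are close enough to the optimum the zero pattern stops changing; this finite-time identification is what the paper's explicit identification results sharpen.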


RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns. To address these issues, secure two-party computation (2PC) has been proposed as a means of enabling

Synthetic Data Can Also Teach: Synthesizing Effective Data for Unsupervised Visual Representation Learning

A data generation framework with two methods to improve contrastive learning (CL) training by joint sample generation and contrastive learning is proposed; experimental results on multiple datasets show the superior accuracy and data efficiency of the proposed data generation methods applied to CL.

Distributed Contrastive Learning for Medical Image Segmentation

Two federated self-supervised learning frameworks for volumetric medical image segmentation with limited annotations are proposed, which substantially improve segmentation and generalization performance compared with state-of-the-art techniques.

Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

This paper proposes a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large-scale empirical risk minimization problems for learning tasks.

Doubly Sparse Asynchronous Learning for Stochastic Composite Optimization

A new accelerated doubly sparse asynchronous learning (DSAL) method for stochastic composite optimization, under which two algorithms are proposed for shared-memory and distributed-memory architectures, respectively; gradient descent is conducted only on the nonzero coordinates (data sparsity) and the active set (model sparsity).

Variable Screening for Sparse Online Regression

By combining with a screening rule, it is shown how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite sparsity identification.

Efficient Online and Batch Learning Using Forward Backward Splitting

The two-phase approach yields sparse solutions when used in conjunction with regularization functions that promote sparsity, such as ℓ1, ℓ2, ℓ2², and ℓ∞ regularization, and is extended with efficient implementations for very high-dimensional sparse data.
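The two phases named above (a forward gradient step on the loss, then a backward proximal step on the regularizer) rely on closed-form proximal operators. A hedged sketch of the standard closed forms for the regularizers listed, with function names of my own choosing (eta denotes the effective step size times the regularization weight):

```python
import numpy as np

def prox_l1(v, eta):
    # Soft-thresholding: prox of eta*||.||_1; yields coordinate-wise sparsity.
    return np.sign(v) * np.maximum(np.abs(v) - eta, 0.0)

def prox_l2(v, eta):
    # Block soft-thresholding: prox of eta*||.||_2; zeroes the whole vector
    # when its norm is at most eta, otherwise shrinks it radially.
    n = np.linalg.norm(v)
    return np.zeros_like(v) if n <= eta else (1.0 - eta / n) * v

def prox_l2_squared(v, eta):
    # Prox of eta*||.||_2^2 (ridge): uniform shrinkage, never exactly sparse.
    return v / (1.0 + 2.0 * eta)
```

Only the ℓ1 and ℓ2 operators produce exact zeros, which is why they, unlike squared-ℓ2, promote sparse solutions in the two-phase scheme.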

Accelerated Mini-batch Randomized Block Coordinate Descent Method

A mini-batch randomized block coordinate descent (MRBCD) method, which estimates the partial gradient of the selected block based on a mini-batch of randomly sampled data in each iteration, and shows that for strongly convex functions, the MRBCD method attains lower overall iteration complexity than existing RBCD methods.

Screening Rules for Lasso with Non-Convex Sparse Regularizers

This work is the first to introduce a screening-rule strategy into a non-convex Lasso solver, using an iterative majorization-minimization strategy, and provides guarantees that the inner solver is able to identify the zero components of its critical point in finite time.

Local Convergence Properties of SAGA/Prox-SVRG and Acceleration

This paper presents a unified framework for the local convergence analysis of the SAGA/Prox-SVRG algorithms, and discusses various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence.

Fast OSCAR and OWL Regression via Safe Screening Rules

This paper proposes the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure via an iterative strategy, which overcomes the difficulties of tackling the non-separable regularizer.

Randomized Block Coordinate Descent for Online and Stochastic Optimization

It is shown that, by reducing the variance of stochastic gradients, ORBCD can converge at a geometric rate in expectation, matching the convergence rate of SGD with variance reduction and of RBCD.

Blitz: A Principled Meta-Algorithm for Scaling Sparse Optimization

BLITZ is a fast working-set algorithm, accompanied by useful guarantees, that outperforms existing solvers in sequential, limited-memory, and distributed settings; it is not specific to ℓ1-regularized learning, making the algorithm relevant to many applications involving sparsity or constraints.