An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification
@article{Bao2022AnAD,
  title={An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification},
  author={R. Bao and Bin Gu and Heng Huang},
  journal={Proceedings of the 31st ACM International Conference on Information \& Knowledge Management},
  year={2022}
}
Sparsity-regularized loss minimization problems play an important role in various fields, including machine learning, data mining, and modern statistics. The proximal gradient descent method and the coordinate descent method are the most popular approaches to solving such minimization problems. Although existing methods can achieve implicit model identification, a.k.a. support set identification, in a finite number of iterations, these methods still suffer from huge computational costs and memory burdens in…
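For intuition only, here is a minimal proximal gradient (ISTA-style) sketch for ℓ1-regularized least squares; the names `soft_threshold` and `proximal_gradient_lasso` are illustrative, and this full-gradient batch routine is not the paper's accelerated doubly stochastic method. It only shows how the proximal (soft-thresholding) step zeroes coordinates, which is what support set (model) identification refers to:

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def proximal_gradient_lasso(A, b, lam, step, n_iters=500):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 by proximal gradient descent.
    Batch (full-gradient) routine, for illustration only."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                  # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    support = np.nonzero(x)[0]                    # identified support (model) set
    return x, support
```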
3 Citations
RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference
- Computer Science, ArXiv
- 2023
The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns. To address these issues, secure two-party computation (2PC) has been proposed as a means of enabling…
Synthetic Data Can Also Teach: Synthesizing Effective Data for Unsupervised Visual Representation Learning
- Computer Science
- 2022
A data generation framework with two methods to improve contrastive learning (CL) training by joint sample generation and contrastive learning is proposed; experimental results on multiple datasets show the superior accuracy and data efficiency of the proposed data generation methods applied to CL.
Distributed Contrastive Learning for Medical Image Segmentation
- Computer Science, Medical Image Anal.
- 2022
Two federated self-supervised learning frameworks for volumetric medical image segmentation with limited annotations are proposed, which substantially improve segmentation and generalization performance compared with state-of-the-art techniques.
References (showing 1-10 of 48)
Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization
- Computer Science, IJCAI
- 2017
This paper proposes a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large-scale empirical risk minimization problems in learning tasks.
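For illustration, a rough sketch of the "doubly stochastic" idea from the entry above: sample a mini-batch of examples and a random block of coordinates in the same iteration, and update only that block. The function name and default sizes are made up, and the multi-momentum acceleration of the cited algorithm is omitted:

```python
import numpy as np

def doubly_stochastic_prox_step(x, A, b, lam, step, batch_size=32, block_size=64, rng=None):
    """One illustrative doubly stochastic proximal update for the Lasso:
    sample a mini-batch of rows (data) and a random block of coordinates
    (features), then update only that block with a soft-thresholding step."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = A.shape
    rows = rng.choice(n, size=min(batch_size, n), replace=False)    # stochastic in data
    block = rng.choice(d, size=min(block_size, d), replace=False)   # stochastic in coordinates
    residual = A[rows] @ x - b[rows]
    grad_block = A[rows][:, block].T @ residual / rows.size
    z = x[block] - step * grad_block
    x[block] = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # proximal step
    return x
```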
Doubly Sparse Asynchronous Learning for Stochastic Composite Optimization
- Computer Science, IJCAI
- 2022
A new accelerated doubly sparse asynchronous learning (DSAL) method for stochastic composite optimization is presented, under which two algorithms are proposed for shared-memory and distributed-memory architectures, respectively; these conduct gradient descent only on the nonzero coordinates (data sparsity) and the active set (model sparsity).
Variable Screening for Sparse Online Regression
- Medicine, Journal of Computational and Graphical Statistics
- 2022
By combining online algorithms with a screening rule, it is shown how to eliminate useless features from the iterates they generate and thereby enforce finite-time sparsity identification.
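As a loose illustration of feature screening (a generic Gap Safe style test for the Lasso, not the rule of the cited paper), the sketch below certifies, from a duality-gap bound, coordinates that can be fixed at zero and removed from further updates; the function name is hypothetical:

```python
import numpy as np

def gap_safe_screen(A, b, x, lam):
    """Gap Safe style screening test for min_x 0.5*||A x - b||^2 + lam*||x||_1.
    Returns indices of features whose optimal coefficient is provably zero,
    so they can be dropped from subsequent iterations. Illustrative sketch."""
    residual = b - A @ x
    # Dual-feasible point obtained by rescaling the residual.
    theta = residual / max(lam, np.max(np.abs(A.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.abs(x).sum()
    dual = 0.5 * b @ b - 0.5 * lam**2 * np.sum((theta - b / lam) ** 2)
    radius = np.sqrt(2.0 * max(primal - dual, 0.0)) / lam   # safe-sphere radius
    scores = np.abs(A.T @ theta) + radius * np.linalg.norm(A, axis=0)
    return np.where(scores < 1.0)[0]
```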
Efficient Online and Batch Learning Using Forward Backward Splitting
- Computer Science, J. Mach. Learn. Res.
- 2009
The two-phase approach enables sparse solutions when used in conjunction with regularization functions that promote sparsity, such as ℓ1, ℓ2, ℓ2², and ℓ∞ regularization, and it is extended and given efficient implementations for very high-dimensional, sparse data.
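A minimal sketch of the two phases for the ℓ1 case (an unconstrained gradient step followed by a closed-form proximal step that produces sparse iterates); `fobos_l1_step` is a hypothetical helper name, not the reference implementation:

```python
import numpy as np

def fobos_l1_step(w, grad, step, lam):
    """One forward-backward splitting step with an l1 regularizer:
    phase 1 takes a (sub)gradient step, phase 2 applies the closed-form
    proximal map (soft-thresholding), which zeroes small coordinates."""
    w_half = w - step * grad                                             # phase 1: gradient step
    return np.sign(w_half) * np.maximum(np.abs(w_half) - step * lam, 0.0)  # phase 2: proximal step
```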
Accelerated Mini-batch Randomized Block Coordinate Descent Method
- Computer Science, NIPS
- 2014
A mini-batch randomized block coordinate descent (MRBCD) method is proposed, which estimates the partial gradient of the selected block based on a mini-batch of randomly sampled data in each iteration; it is shown that for strongly convex functions, the MRBCD method attains lower overall iteration complexity than existing RBCD methods.
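For intuition, a hedged sketch of one epoch of mini-batch randomized block coordinate descent with an SVRG-style variance-reduced partial gradient on a smooth least-squares loss; the function name and default sizes are illustrative, not the cited MRBCD implementation:

```python
import numpy as np

def mrbcd_epoch(x, A, b, step, batch_size=16, block_size=32, rng=None):
    """One epoch: repeatedly sample a mini-batch of rows and a random block
    of coordinates, and update the block with a variance-reduced partial
    gradient anchored at a snapshot point. Illustrative only."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = A.shape
    x_ref = x.copy()
    full_grad = A.T @ (A @ x_ref - b) / n                 # snapshot full gradient
    for _ in range(max(n // batch_size, 1)):
        rows = rng.choice(n, size=min(batch_size, n), replace=False)
        block = rng.choice(d, size=min(block_size, d), replace=False)
        res = A[rows] @ x - b[rows]
        res_ref = A[rows] @ x_ref - b[rows]
        A_rb = A[rows][:, block]
        # Variance-reduced estimate of the block partial gradient.
        g = A_rb.T @ (res - res_ref) / rows.size + full_grad[block]
        x[block] -= step * g
    return x
```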
Screening Rules for Lasso with Non-Convex Sparse Regularizers
- Computer Science, ICML
- 2019
This work is the first to introduce a screening-rule strategy into a non-convex Lasso solver based on an iterative majorization-minimization scheme, and it provides guarantees that the inner solver identifies the zero components of its critical point in finite time.
Local Convergence Properties of SAGA/Prox-SVRG and Acceleration
- Computer Science, ICML
- 2018
This paper presents a unified framework for the local convergence analysis of the SAGA/Prox-SVRG algorithms, and discusses various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence.
Fast OSCAR and OWL Regression via Safe Screening Rules
- Computer Science, ICML
- 2020
This paper proposes the first safe screening rule for OWL regression, exploring the order of the primal solution despite its unknown order structure via an iterative strategy, which overcomes the difficulty of tackling the non-separable regularizer.
Randomized Block Coordinate Descent for Online and Stochastic Optimization
- Computer Science, ArXiv
- 2014
By reducing the variance of the stochastic gradients, ORBCD is shown to converge at a geometric rate in expectation, matching the convergence rates of SGD with variance reduction and of RBCD.
Blitz: A Principled Meta-Algorithm for Scaling Sparse Optimization
- Computer Science, ICML
- 2015
BLITZ is a fast working-set algorithm accompanied by useful guarantees; it outperforms existing solvers in sequential, limited-memory, and distributed settings, and it is not specific to ℓ1-regularized learning, making the algorithm relevant to many applications involving sparsity or constraints.
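To illustrate the working-set idea in general terms (not BLITZ's actual subproblem selection rule or guarantees), the sketch below repeatedly solves the Lasso restricted to a growing set of columns; `subsolver` is a hypothetical callback for the restricted problem:

```python
import numpy as np

def working_set_lasso(A, b, lam, subsolver, max_outer=20, tol=1e-6):
    """Generic working-set loop for the Lasso: solve the problem restricted
    to a small set of columns, then add columns that violate the optimality
    conditions. `subsolver(A_sub, b, lam)` is a hypothetical helper, e.g.
    coordinate descent on the restricted problem. Illustrative sketch."""
    d = A.shape[1]
    x = np.zeros(d)
    work = set()
    for _ in range(max_outer):
        violations = np.abs(A.T @ (A @ x - b))            # |gradient| per feature
        violating = np.where(violations > lam + tol)[0]
        if violating.size == 0:
            break                                          # KKT conditions hold (approximately)
        work.update(violating.tolist())
        cols = sorted(work)
        x = np.zeros(d)
        x[cols] = subsolver(A[:, cols], b, lam)            # solve restricted subproblem
    return x
```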