• Corpus ID: 235490676

Cogradient Descent for Dependable Learning

@article{Wang2021CogradientDF,
  title={Cogradient Descent for Dependable Learning},
  author={Runqi Wang and Baochang Zhang and Li'an Zhuo and Qixiang Ye and David S. Doermann},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.10617}
}
Conventional gradient descent methods compute the gradients for multiple variables through partial derivatives. Treating coupled variables independently while ignoring their interaction, however, leads to insufficient optimization of bilinear models. In this paper, we propose dependable learning based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem, providing a systematic way to coordinate the gradients of coupled variables based on a kernelized…
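To make the setting in the abstract concrete, here is a minimal sketch of the conventional baseline it describes: a bilinear objective whose two coupled variables are each updated from their own partial derivative. This is not CoGD itself, since the abstract is truncated before the method is specified. The rank-1 factorization objective, the function names, the step size, and the random data are all assumptions chosen for illustration.

```python
import numpy as np

def bilinear_loss(M, u, v):
    """Assumed toy objective: 0.5 * ||M - u v^T||_F^2, bilinear in (u, v)."""
    R = M - np.outer(u, v)
    return 0.5 * np.sum(R * R)

def partial_gradients(M, u, v):
    """Partial derivatives of the loss with respect to u and v."""
    R = M - np.outer(u, v)
    grad_u = -R @ v      # dL/du, computed with v held fixed
    grad_v = -R.T @ u    # dL/dv, computed with u held fixed
    return grad_u, grad_v

def independent_gd(M, u, v, lr=0.01, steps=500):
    """Plain gradient descent: each variable follows its own partial
    derivative, with no explicit coordination between the two updates."""
    losses = [bilinear_loss(M, u, v)]
    for _ in range(steps):
        g_u, g_v = partial_gradients(M, u, v)
        u = u - lr * g_u
        v = v - lr * g_v
        losses.append(bilinear_loss(M, u, v))
    return u, v, losses

# Hypothetical data: a rank-1 target matrix and small random initial factors.
rng = np.random.default_rng(0)
M = np.outer(rng.normal(size=5), rng.normal(size=4))
u0 = 0.1 * rng.normal(size=5)
v0 = 0.1 * rng.normal(size=4)
u, v, losses = independent_gd(M, u0, v0)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.6f}")
```

In this sketch the step for u depends on the current value of v and vice versa, which is the coupling the abstract refers to; the two updates are nonetheless applied independently.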

References

Cogradient Descent for Bilinear Optimization

A Cogradient Descent algorithm to address the bilinear problem, based on a theoretical framework to coordinate the gradient of hidden variables via a projection function, which improves the state-of-the-art by a significant margin.

Factorized Bilinear Models for Image Recognition

A novel Factorized Bilinear (FB) layer is proposed to model the pairwise feature interactions by considering the quadratic terms in the transformations of CNNs to reduce the risk of overfitting.

Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure

Centripetal SGD (C-SGD) is a novel optimization method that trains several filters to collapse into a single point in the parameter hyperspace. It partly solves an open problem of constrained filter pruning on CNNs with complicated structure, where some layers must be pruned following the others.

Consensus Convolutional Sparse Coding

By learning CSC features from large-scale image datasets for the first time, this paper achieves significant quality improvements in a number of imaging tasks and enables new applications in high-dimensional feature learning that has been intractable using existing CSC methods.

Bilinear Modeling via Augmented Lagrange Multipliers (BALM)

A unified approach to solve different bilinear factorization problems in computer vision in the presence of missing data in the measurements via Augmented Lagrange Multipliers, which can seamlessly handle different computer vision problems.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
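As a rough illustration of what "adaptive estimates of lower-order moments" means in practice, the sketch below implements the standard Adam update with bias-corrected first and second moment estimates. The default hyperparameters follow the commonly cited values; the quadratic test objective, the function names, and the step count are assumptions made for this example only.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. `t` is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # first moment (mean) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second moment (uncentered variance) estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for zero initialization
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def minimize_quadratic(target, lr=0.05, steps=2000):
    """Assumed toy problem: minimize f(theta) = ||theta - target||^2 with Adam."""
    theta = np.zeros_like(target)
    m = np.zeros_like(target)
    v = np.zeros_like(target)
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - target)
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta

target = np.array([3.0, -1.0])
theta = minimize_quadratic(target)
print(theta)
```

Because of the bias correction, the very first step has magnitude close to `lr` in each coordinate regardless of the gradient's scale, which is one way to see the per-parameter step size adaptation.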

Data-Driven Sparse Structure Selection for Deep Neural Networks

A simple and effective framework to learn and prune deep models in an end-to-end manner by adding sparsity regularizations on factors, and solving the optimization problem by a modified stochastic Accelerated Proximal Gradient (APG) method.

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

  • Shaohui Lin, R. Ji, D. Doermann
  • Computer Science
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
This paper proposes an effective structured pruning approach that jointly prunes filters and other structures in an end-to-end manner, and solves the optimization problem by generative adversarial learning (GAL), which learns a sparse soft mask in a label-free, end-to-end manner.

Image Reconstruction via Manifold Constrained Convolutional Sparse Coding for Image Sets

The proposed MCSC is a generic approach as it achieves better results than the state-of-the-art approaches based on convolutional sparse coding in other image reconstruction tasks, such as face reconstruction, digit reconstruction, and image restoration.

Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration

Unlike previous methods, FPGM compresses CNN models by pruning filters with redundancy, rather than those with “relatively less” importance, and when applied to two image classification benchmarks, the method validates its usefulness and strengths.