Improving Deep Neural Network Sparsity through Decorrelation Regularization

Xiaotian Zhu, Wen-gang Zhou, Houqiang Li
Modern deep learning models usually suffer from high complexity in model size and computation when deployed on resource-constrained platforms. To this end, many works are dedicated to compressing deep neural networks. Adding group LASSO regularization is one of the most effective model compression methods, since it generates structured sparse networks. We investigate deep neural networks trained under a group LASSO constraint and observe that even with strong sparsity regularization imposed, there…
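The group LASSO penalty mentioned in the abstract sums the L2 norms of predefined weight groups (e.g. the columns of a layer's weight matrix, corresponding to filters or channels); driving a group's norm to zero removes the whole structured unit. A minimal sketch — an illustration of the penalty itself, not the paper's full training objective:

```python
import numpy as np

def group_lasso_penalty(W, axis=0):
    """Sum of L2 norms of weight groups (here: rows or columns of W).

    Zeroing an entire group's norm removes a structured unit such as a
    filter or channel, which is why group LASSO produces structured
    (rather than unstructured) sparsity.
    """
    return np.sqrt((W ** 2).sum(axis=axis)).sum()

# Hypothetical 4x3 weight matrix whose second column (a group) is already zero.
W = np.array([[1.0, 0.0, 2.0],
              [2.0, 0.0, 1.0],
              [2.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
penalty = group_lasso_penalty(W, axis=0)  # column norms 3 + 0 + 3 → 6.0
```

The zero column contributes nothing to the penalty, so the regularizer only pays for groups that remain active.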


Discrimination-Aware Network Pruning for Deep Model Compression

This paper proposes a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power of the network, and proposes a greedy algorithm to solve the resultant problem.

Structure injected weight normalization for training deep networks

The proposed deep structural weight normalization (DSWN) methods can reduce the number of trainable parameters while guaranteeing high accuracy, and the DSWN-NM variant can accelerate convergence while improving the performance of deep networks.

The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics

A novel framework for weight pruning by sampling from a probability function that favors the zeroing of smaller weights is presented, and the contribution of $L_1$ and $L_0$ regularization to the dynamics of node pruning while optimizing for weight pruning is examined.

Self-Orthogonality Module: A Network Architecture Plug-in for Learning Orthogonal Filters

This paper proposes to introduce an implicit self-regularization into orthogonality regularization (OR) to push the mean and variance of filter angles in a network towards 90° and 0° simultaneously, achieving (near) orthogonality among the filters without any other explicit regularization.
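The mean and variance of pairwise filter angles that this summary refers to can be computed directly from the flattened filters; a small sketch (my illustration of the statistics being regularized, not the paper's module):

```python
import numpy as np

def angle_stats(filters):
    """Mean and variance of pairwise angles (in degrees) between filters.

    Each row is one flattened filter. The self-orthogonality idea is to
    push the mean angle toward 90 degrees and the variance toward 0, so
    that all filter pairs are (nearly) orthogonal.
    """
    F = filters / np.linalg.norm(filters, axis=1, keepdims=True)
    cos = F @ F.T                              # pairwise cosine similarities
    iu = np.triu_indices(len(F), k=1)          # upper triangle: distinct pairs
    angles = np.degrees(np.arccos(np.clip(cos[iu], -1.0, 1.0)))
    return angles.mean(), angles.var()

# Hypothetical perfectly orthogonal filter bank: mean angle 90°, variance 0.
filters = np.eye(3)
mean_deg, var_deg = angle_stats(filters)  # → (90.0, 0.0)
```

A regularizer in this spirit would penalize `(mean_deg - 90)**2 + var_deg` during training.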

CondenseNet with exclusive lasso regularization

CondenseNet-elasso is developed, which applies exclusive lasso regularization to CondenseNet in order to eliminate feature correlation among different convolution groups and alleviate the network's overfitting problem.

Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)

This work presents a novel single-stage structured pruning method termed DiscriminAtive Masking (DAM), to discriminatively prefer some of the neurons to be refined during the training process, while gradually masking out other neurons.

BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted Regularization Method

A new block-based pruning framework is proposed that comprises a general and flexible structured pruning dimension as well as a powerful and efficient reweighted regularization method, achieving universal coverage of both CNNs and RNNs with real-time mobile acceleration and no accuracy compromise.

Layer-Wise Network Compression Using Gaussian Mixture Model

This work proposes a layer-adaptive pruning method based on modeling the weight distribution; it shows a higher compression rate while maintaining accuracy compared with previous methods, and the proposed method is applied to image classification and semantic segmentation.

AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters

This work proposes AutoPrune, which prunes the network by optimizing a set of trainable auxiliary parameters instead of the original weights; it can automatically eliminate network redundancy with recoverability, removing the need for the complicated prior knowledge required to design thresholding functions and reducing the time spent on trial and error.

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency

SS-Auto is proposed, a single-shot, automatic structured pruning framework that can achieve row pruning and column pruning simultaneously; it adopts a soft constraint-based formulation to alleviate the strong non-convexity of the l0-norm constraints used in state-of-the-art ADMM-based methods, yielding faster convergence and fewer hyperparameters.



Learning Structured Sparsity in Deep Neural Networks

The results show that for CIFAR-10, regularization on layer depth can reduce a 20-layer Deep Residual Network to 18 layers while improving the accuracy from 91.25% to 92.60%, which is still slightly higher than that of the original ResNet with 32 layers.

Data-Driven Sparse Structure Selection for Deep Neural Networks

A simple and effective framework to learn and prune deep models in an end-to-end manner by adding sparsity regularization on scaling factors and solving the optimization problem with a modified stochastic Accelerated Proximal Gradient (APG) method.

Adaptive Layerwise Quantization for Deep Neural Network Compression

An adaptive layerwise quantization method that quantizes the network with different bitwidths assigned to different layers, using the entropy of weights and activations as an importance indicator for each layer.

Towards Convolutional Neural Networks Compression via Global Error Reconstruction

A global error reconstruction method termed GER is presented, which first leverages an SVD-based low-rank approximation to coarsely compress the parameters of the fully connected layers in a layerwise manner, and then jointly optimizes them from a global perspective via back-propagation.
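The SVD-based low-rank step works by factoring an m×n fully connected weight matrix into an m×r and an r×n factor, cutting parameters from m·n to r·(m+n). A minimal sketch of just that step (the global fine-tuning stage is omitted):

```python
import numpy as np

def svd_compress(W, rank):
    """Rank-r approximation of a fully connected layer's weight matrix.

    Returns factors A (m x r) and B (r x n) with W ≈ A @ B, which can be
    implemented as two smaller consecutive linear layers.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Hypothetical exactly rank-1 weight matrix: a rank-1 SVD reconstructs it.
W = np.outer(np.arange(1.0, 5.0), np.arange(1.0, 4.0))  # 4x3, rank 1
A, B = svd_compress(W, rank=1)
err = np.linalg.norm(W - A @ B)
```

In practice the rank is chosen per layer to trade reconstruction error against parameter count, and the residual error is what the subsequent global back-propagation step recovers.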

Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures

This paper introduces network trimming which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset, inspired by an observation that the outputs of a significant portion of neurons in a large network are mostly zero.
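The "mostly zero" observation is typically scored per neuron as the fraction of zero post-ReLU activations over a dataset (Network Trimming calls this the Average Percentage of Zeros, APoZ). A small sketch with hypothetical activations and a hypothetical 90% threshold:

```python
import numpy as np

def apoz(activations):
    """Average Percentage of Zeros per neuron.

    activations: (num_examples, num_neurons) post-ReLU outputs.
    Neurons that are zero for most inputs contribute little and are
    candidates for pruning.
    """
    return (activations == 0).mean(axis=0)

# Hypothetical activations: 4 examples x 3 neurons; neuron 2 is always zero.
acts = np.array([[0.0, 1.2, 0.0],
                 [0.5, 0.7, 0.0],
                 [0.0, 0.3, 0.0],
                 [0.9, 0.0, 0.0]])
scores = apoz(acts)                  # fraction of zeros per neuron
prune = np.where(scores > 0.9)[0]    # prune neurons that are almost always inactive
```

Pruning is then applied iteratively: remove the high-APoZ neurons, retrain, and re-measure.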

Pruning Filters for Efficient ConvNets

This work presents an acceleration method for CNNs, where it is shown that even simple filter pruning techniques can reduce inference costs for VGG-16 and ResNet-110 by up to 38% on CIFAR10 while regaining close to the original accuracy by retraining the networks.
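The "simple filter pruning technique" in this work ranks each filter by the L1 norm of its weights and removes the smallest. A compact sketch of the selection step (the retraining stage is omitted):

```python
import numpy as np

def filters_to_prune(conv_weights, num_prune):
    """Indices of the filters with the smallest L1 norms.

    conv_weights: (num_filters, in_channels, k, k). Removing a whole
    filter also removes its output feature map, so this yields
    structured speedups without sparse kernels.
    """
    norms = np.abs(conv_weights).sum(axis=(1, 2, 3))
    return np.argsort(norms)[:num_prune]

# Hypothetical conv layer with 4 filters of shape (2, 3, 3);
# filter 2 is scaled to near zero, so it should be selected first.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2, 3, 3))
W[2] *= 1e-3
pruned = filters_to_prune(W, num_prune=1)
```

After pruning, the corresponding channels of the next layer's kernels are dropped as well, and the network is retrained to regain accuracy.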

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

A simple two-step approach for speeding up convolution layers within large convolutional neural networks, based on tensor decomposition and discriminative fine-tuning, is proposed; for the smaller of the two networks, higher CPU speedups are obtained at the cost of accuracy drops.

ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both the training and inference stages; it reveals that filters need to be pruned based on statistics computed from the next layer, not the current layer, which differentiates ThiNet from existing methods.

Learning Efficient Convolutional Networks through Network Slimming

The approach is called network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy.
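The channel-identification step in network slimming is simple: train with an L1 penalty on the batch-norm scaling factors (gammas), then keep only the channels with the largest |gamma|. A sketch of the selection step, with hypothetical gamma values and keep ratio:

```python
import numpy as np

def slim_channels(gammas, keep_ratio):
    """Indices of the channels to keep, ranked by |gamma|.

    gammas: per-channel batch-norm scaling factors after training with
    an L1 penalty, which pushes unimportant channels' factors toward 0.
    """
    k = max(1, int(round(len(gammas) * keep_ratio)))
    order = np.argsort(np.abs(gammas))[::-1]  # largest |gamma| first
    return np.sort(order[:k])

# Hypothetical gammas after L1-regularized training: half are near zero.
gammas = np.array([0.90, 0.01, 0.75, 0.00, 0.60, 0.02])
kept = slim_channels(gammas, keep_ratio=0.5)  # → channels 0, 2, 4
```

In the actual method the threshold is chosen globally across all layers, and the slimmed network is fine-tuned afterwards.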

Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications

A simple and effective scheme to compress the entire CNN, called one-shot whole-network compression, which also addresses an important implementation-level issue with 1×1 convolution, a key operation in the inception module of GoogLeNet as well as in CNNs compressed by the proposed scheme.