• Corpus ID: 52979229

Dynamic Channel Pruning: Feature Boosting and Suppression

@article{Gao2019DynamicCP,
  title={Dynamic Channel Pruning: Feature Boosting and Suppression},
  author={Xitong Gao and Yiren Zhao and Lukasz Dudziak and Robert D. Mullins and Chengzhong Xu},
  journal={ArXiv},
  year={2019},
  volume={abs/1810.05331}
}
Making deep convolutional neural networks more accurate typically comes at the cost of increased computational and memory resources. This work introduces feature boosting and suppression (FBS), a method that dynamically amplifies salient convolutional channels and suppresses unimportant ones at run-time. Key Method In contrast to channel pruning methods, which permanently remove channels, FBS preserves the full network structure and accelerates convolution by dynamically skipping unimportant input and output channels. FBS-augmented networks are trained with conventional stochastic gradient descent, making the method readily applicable to many state-of-the-art CNNs. We compare FBS to…
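The abstract's description of dynamic skipping suggests a per-layer saliency gate. The sketch below is a minimal PyTorch illustration of that idea: a cheap predictor estimates one saliency per output channel from a pooled summary of the input, a k-winners-take-all step keeps only the top channels, and the surviving saliencies rescale the convolution output. The module name FBSConv2d, the ReLU on the predictor, and the density ratio d are assumptions made for this sketch, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FBSConv2d(nn.Module):
    # Sketch of a convolution gated in the spirit of feature boosting and
    # suppression (FBS); details are illustrative assumptions.
    def __init__(self, in_channels, out_channels, kernel_size, d=0.5, **conv_kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              bias=False, **conv_kwargs)
        # Auxiliary predictor: pooled input -> one saliency per output channel.
        self.saliency = nn.Linear(in_channels, out_channels)
        # Number of output channels kept per sample (d is the target density).
        self.k = max(1, int(round(d * out_channels)))

    def forward(self, x):
        # Summarise the input spatially, then predict non-negative channel saliencies.
        s = F.relu(self.saliency(x.mean(dim=(2, 3))))      # shape (N, C_out)
        # k-winners-take-all: keep the k largest saliencies, zero the rest.
        kth = torch.topk(s, self.k, dim=1).values[:, -1:]  # k-th largest per sample
        gate = s * (s >= kth).float()
        # Suppressed channels are scaled by zero, so their convolutions (and the
        # matching input channels of the next layer) can be skipped at inference.
        return self.conv(x) * gate.unsqueeze(-1).unsqueeze(-1)

A layer built this way can replace a standard convolution and, as the abstract notes, be trained with plain SGD; for example, FBSConv2d(64, 128, 3, padding=1, d=0.5) would leave roughly half of its 128 output channels active for each input.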

Citations

Channel Gating Neural Networks
TLDR
An accelerator is designed for channel gating, a dynamic, fine-grained, and hardware-efficient pruning scheme to reduce the computation cost for convolutional neural networks (CNNs), which optimizes CNN inference at run-time by exploiting input-specific characteristics.
Dynamic Channel and Layer Gating in Convolutional Neural Networks
TLDR
It is argued that combining the recently proposed channel gating mechanism with layer gating can significantly reduce the computational cost of large CNNs.
Deep Neural Network Acceleration With Sparse Prediction Layers
TLDR
This work proposes an efficient yet general scheme called Sparse Prediction Layer (SPL) which can predict and skip the trivial elements in the CNN layer and can further accelerate these networks pruned by other pruning-based methods.
Understanding the Impact of Dynamic Channel Pruning on Conditionally Parameterized Convolutions
TLDR
This paper analyzes a recent method, Feature Boosting and Suppression (FBS), which dynamically assesses which channels contain the most important input-dependent features and prunes the others via a runtime threshold gating mechanism, and finds that substituting standard convolutional filters with input-specific filters, as described in CondConv, enables FBS to recover the associated accuracy loss.
Differentiable Channel Pruning Search
TLDR
A novel weight-sharing technique that elegantly eliminates the shape-mismatch problem with negligible additional resources is introduced, achieving state-of-the-art pruning results for image classification on CIFAR-10, CIFAR-100, and ImageNet.
Batch-shaping for learning conditional channel gated networks
TLDR
This work introduces a new residual block architecture that gates convolutional channels in a fine-grained manner, along with a generally applicable batch-shaping tool that matches the marginal aggregate posteriors of features in a neural network to a pre-specified prior distribution.
Dynamic Group Convolution for Accelerating Convolutional Neural Networks
TLDR
This paper proposes dynamic group convolution (DGC), which adaptively selects which input channels to connect within each group for individual samples on the fly, while retaining computational efficiency similar to conventional group convolution.
Dynamic Dual Gating Neural Networks
TLDR
Dynamic dual gating is proposed, a new dynamic computing method that reduces model complexity at run-time and achieves higher accuracy under similar computing budgets compared with other dynamic execution methods.
Loss Constrains Added Squeeze and Excitation Blocks for Pruning Deep Neural Networks
  • Yiqin Wang, Ming Li, Xiaozhou Xu
  • Computer Science
    2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV)
  • 2020
TLDR
This work proposes a new pruning approach that needs no fine-tuning; its criterion is based on activations rather than trainable parameters, which makes the pruning process more stable and time-saving.
Focused Quantization for Sparse CNNs
TLDR
This paper attends to the statistical properties of sparse CNNs and presents focused quantization, a novel quantization strategy based on power-of-two values, which exploits the weight distributions after fine-grained pruning, significantly reducing model sizes.
...

References

SHOWING 1-10 OF 44 REFERENCES
Channel Gating Neural Networks
TLDR
An accelerator is designed for channel gating, a dynamic, fine-grained, and hardware-efficient pruning scheme to reduce the computation cost for convolutional neural networks (CNNs), which optimizes CNN inference at run-time by exploiting input-specific characteristics.
Channel Pruning for Accelerating Very Deep Neural Networks
  • Yihui He, X. Zhang, Jian Sun
  • Computer Science
    2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
TLDR
This paper proposes an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection and least-squares reconstruction, and generalizes this algorithm to multi-layer and multi-branch cases.
Learning Efficient Convolutional Networks through Network Slimming
TLDR
The approach, called network slimming, takes wide and large networks as input models; insignificant channels are automatically identified during training and pruned afterwards, yielding thin and compact models with comparable accuracy.
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
TLDR
ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both the training and inference stages; it prunes filters based on statistics computed from the next layer rather than the current layer, which differentiates ThiNet from existing methods.
Discrimination-aware Channel Pruning for Deep Neural Networks
TLDR
This work investigates a simple-yet-effective method, called discrimination-aware channel pruning, to choose those channels that really contribute to discriminative power and proposes a greedy algorithm to conduct channel selection and parameter optimization in an iterative way.
Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
TLDR
This paper proposes a channel pruning technique for accelerating the computations of deep convolutional neural networks (CNNs) that focuses on directly simplifying the channel-to-channel computation graph of a CNN without needing to perform a computationally difficult and not-always-useful task.
Pruning Filters for Efficient ConvNets
TLDR
This work presents an acceleration method for CNNs, showing that even simple filter pruning techniques can reduce inference costs for VGG-16 and ResNet-110 by up to 38% on CIFAR-10 while regaining close to the original accuracy by retraining the networks.
SBNet: Sparse Blocks Network for Fast Inference
TLDR
This work leverages the sparsity structure of computation masks and proposes a novel tiling-based sparse convolution algorithm that is effective on LiDAR-based 3D object detection, and reports significant wall-clock speed-ups compared to dense convolution without noticeable loss of accuracy.
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
TLDR
The proposed Soft Filter Pruning (SFP) method enables the pruned filters to be updated when training the model after pruning, which has two advantages over previous works: larger model capacity and less dependence on the pretrained model.
Learning Structured Sparsity in Deep Neural Networks
TLDR
The results show that for CIFAR-10, regularization on layer depth can reduce a 20-layer Deep Residual Network to 18 layers while improving the accuracy from 91.25% to 92.60%, which is still slightly higher than that of the original ResNet with 32 layers.
...