AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference

@article{Luo2020AutoPrunerAE,
  title={AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference},
  author={Jian-Hao Luo and Jianxin Wu},
  journal={ArXiv},
  year={2020},
  volume={abs/1805.08941}
}


HeadStart: Enforcing Optimal Inceptions in Pruning Deep Neural Networks for Efficient Inference on GPGPUs
TLDR
It is proved that an optimal inception is more likely to induce satisfactory performance and shortened fine-tuning iterations, and a reinforcement-learning-based solution, termed HeadStart, is proposed that seeks to learn the best way of pruning aimed at the optimal inception.
Channel Pruning via Automatic Structure Search
TLDR
This paper proposes a new channel pruning method based on the artificial bee colony (ABC) algorithm, dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did.
Neural Network Pruning With Residual-Connections and Limited-Data
  • Jian-Hao Luo, Jianxin Wu
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR
CURL significantly outperforms previous state-of-the-art methods on ImageNet, and when pruning on small datasets, CURL achieves comparable or much better performance than fine-tuning a pretrained small model.
Softer Pruning, Incremental Regularization
TLDR
This work proposes a SofteR Filter Pruning (SRFP) method and its variant, Asymptotic SofteR Filter Pruning (ASRFP), which simply decay the pruned weights with a monotonically decreasing parameter, and notes that SRFP and ASRFP pursue better results at the cost of slower convergence.
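To make the decay idea concrete, here is a minimal PyTorch sketch of softer filter pruning; the L2-norm importance criterion and the geometric decay schedule are illustrative assumptions, not the paper's exact settings.

    import torch
    import torch.nn as nn

    def soft_decay_pruned_filters(conv: nn.Conv2d, prune_ratio: float, decay: float) -> None:
        """Scale the least-important output filters by `decay` instead of zeroing
        them outright (softer pruning). Importance here is each filter's L2 norm,
        an illustrative choice."""
        with torch.no_grad():
            norms = conv.weight.flatten(1).norm(p=2, dim=1)       # one norm per output filter
            n_pruned = int(prune_ratio * conv.out_channels)
            _, idx = torch.topk(norms, n_pruned, largest=False)   # least-important filters
            conv.weight[idx] *= decay                             # decay rather than hard-zero

    # Example: decay 30% of the filters after every epoch, with the decay factor
    # shrinking monotonically so the pruned filters fade out asymptotically.
    conv = nn.Conv2d(64, 128, kernel_size=3)
    for epoch in range(10):
        soft_decay_pruned_filters(conv, prune_ratio=0.3, decay=0.9 ** (epoch + 1))
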
Pruning from Scratch
TLDR
This work finds that pre-training an over-parameterized model is not necessary for obtaining the target pruned structure, and empirically shows that more diverse pruned structures, including potential models with better performance, can be obtained directly from randomly initialized weights.
Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
TLDR
An automatic pruning method that learns which neurons to preserve in order to maintain the model accuracy while reducing the FLOPs to a predefined target is proposed.
Provable Filter Pruning for Efficient Neural Networks
TLDR
This work presents a provable, sampling-based approach for generating compact Convolutional Neural Networks by identifying and removing redundant filters from an over-parameterized network and constructs an importance sampling distribution where filters that highly affect the output are sampled with correspondingly high probability.
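A rough sketch of such sampling-based filter selection is given below; the per-filter sensitivity scores are placeholders here, whereas the paper derives them in a way that yields provable guarantees.

    import numpy as np

    def sample_filters(sensitivity: np.ndarray, n_keep: int, rng=None) -> np.ndarray:
        """Sample `n_keep` distinct filters with probability proportional to their
        sensitivity, so filters that strongly affect the output are likely kept."""
        rng = np.random.default_rng() if rng is None else rng
        probs = sensitivity / sensitivity.sum()
        return rng.choice(len(sensitivity), size=n_keep, replace=False, p=probs)

    # Example with made-up sensitivity scores for a six-filter layer.
    scores = np.array([0.05, 0.30, 0.10, 0.25, 0.20, 0.10])
    kept = sample_filters(scores, n_keep=3)
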
Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks
TLDR
The proposed trainable gate function, which confers a differentiable property to discrete-valued variables, allows us to directly optimize loss functions that include non-differentiable discrete values such as 0-1 selection.
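The idea of a differentiable 0-1 gate can be sketched with a generic straight-through estimator in PyTorch; this is an illustrative stand-in, not necessarily the exact gate function proposed in the paper.

    import torch
    import torch.nn as nn

    class BinaryGate(nn.Module):
        """A per-channel 0-1 gate made trainable with a straight-through estimator:
        the forward pass uses hard 0/1 decisions, while gradients flow through the
        underlying sigmoid."""
        def __init__(self, n_channels: int):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(n_channels))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            soft = torch.sigmoid(self.logits)          # differentiable surrogate
            hard = (soft > 0.5).float()                # discrete 0-1 selection
            gate = hard + soft - soft.detach()         # straight-through estimator
            return x * gate.view(1, -1, 1, 1)          # gate each channel of (N, C, H, W)

    gate = BinaryGate(n_channels=64)
    y = gate(torch.randn(2, 64, 8, 8))   # gated feature map, differentiable w.r.t. the logits
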
...

References

SHOWING 1-10 OF 66 REFERENCES
Rethinking the Value of Network Pruning
TLDR
It is found that with an optimal learning rate, the "winning ticket" initialization as used in Frankle & Carbin (2019) does not bring improvement over random initialization, and the need for more careful baseline evaluations in future research on structured pruning methods is suggested.
Pruning Convolutional Neural Networks for Resource Efficient Inference
TLDR
It is shown that pruning can lead to more than 10x theoretical (5x practical) reduction in adapted 3D-convolutional filters with a small drop in accuracy in a recurrent gesture classifier.
Importance Estimation for Neural Network Pruning
TLDR
A novel method is described that estimates the contribution of a neuron (filter) to the final loss and iteratively removes those with smaller scores, along with two variations that use first- and second-order Taylor expansions to approximate a filter's contribution.
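For the first-order variant, the per-filter score is commonly taken as the absolute value of the summed weight-gradient products; a minimal sketch, assuming a standard PyTorch backward pass, follows.

    import torch
    import torch.nn as nn

    def first_order_taylor_scores(conv: nn.Conv2d) -> torch.Tensor:
        """First-order Taylor importance of each output filter: |sum(weight * grad)|
        over the filter's parameters. Call after loss.backward()."""
        contribution = (conv.weight * conv.weight.grad).flatten(1)   # (out_channels, -1)
        return contribution.sum(dim=1).abs()

    # Example: score the filters of a toy layer after one backward pass.
    conv = nn.Conv2d(3, 8, kernel_size=3)
    conv(torch.randn(4, 3, 16, 16)).mean().backward()
    scores = first_order_taylor_scores(conv)   # lower score -> better pruning candidate
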
Play and Prune: Adaptive Filter Pruning for Deep Model Compression
TLDR
This work presents a new min-max framework for filter-level pruning of CNNs, which reduces the number of parameters of VGG-16 by an impressive factor of 17.5X and the number of FLOPs by 6.43X with no loss of accuracy, significantly outperforming other state-of-the-art filter pruning methods.
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
TLDR
ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both training and inference stages, and it is revealed that filters need to be pruned based on statistics computed from the next layer, not the current layer, which differentiates ThiNet from existing methods.
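The next-layer criterion can be approximated by a greedy selection over collected channel statistics, as in the sketch below; how the per-channel contributions are gathered and the exact greedy objective are simplifying assumptions.

    import numpy as np

    def greedy_channel_select(contrib: np.ndarray, n_keep: int) -> list:
        """Greedy channel selection in the spirit of ThiNet. `contrib[s, c]` is the
        contribution of input channel c to the next layer's output at sampled
        location s. Channels are kept so that their sum best reconstructs the
        full next-layer response."""
        kept, remaining = [], list(range(contrib.shape[1]))
        full = contrib.sum(axis=1)
        for _ in range(n_keep):
            errs = [np.sum((full - contrib[:, kept + [c]].sum(axis=1)) ** 2) for c in remaining]
            best = remaining[int(np.argmin(errs))]
            kept.append(best)
            remaining.remove(best)
        return sorted(kept)

    # Example with random statistics: 100 sampled locations, 8 input channels.
    stats = np.random.randn(100, 8)
    print(greedy_channel_select(stats, n_keep=4))
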
Structured Pruning for Efficient ConvNets via Incremental Regularization
TLDR
A novel regularization-based pruning method, named IncReg, is proposed to incrementally assign different regularization factors to different weights based on their relative importance; it achieves comparable or even better results than the state of the art.
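One way to read the incremental scheme is as a per-filter group penalty whose factor is raised gradually for the less important filters; the sketch below uses an L2-norm importance measure and a fixed increment as illustrative assumptions.

    import torch
    import torch.nn as nn

    def group_penalty(conv: nn.Conv2d, reg: torch.Tensor) -> torch.Tensor:
        """Group-L2 penalty with a per-filter regularization factor `reg`,
        added to the usual task loss during training."""
        group_norms = conv.weight.flatten(1).norm(p=2, dim=1)   # one norm per output filter
        return (reg * group_norms ** 2).sum()

    conv = nn.Conv2d(16, 32, kernel_size=3)
    reg = torch.zeros(32)
    for step in range(100):
        if step % 10 == 0:                                      # every few steps, raise the
            with torch.no_grad():                               # factor of the weakest filters
                norms = conv.weight.flatten(1).norm(p=2, dim=1)
                _, weak = torch.topk(norms, 8, largest=False)
                reg[weak] += 1e-4
        penalty = group_penalty(conv, reg)
        # total_loss = task_loss + penalty                      # then backprop as usual
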
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
  • Shaohui Lin, R. Ji, D. Doermann
  • Computer Science
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
This paper proposes an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner, and effectively solves the optimization problem via generative adversarial learning (GAL), which learns a sparse soft mask in a label-free, end-to-end manner.
Channel Pruning for Accelerating Very Deep Neural Networks
  • Yihui He, X. Zhang, Jian Sun
  • Computer Science
    2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
TLDR
This paper proposes an iterative two-step algorithm to effectively prune each layer, via LASSO-regression-based channel selection and least-squares reconstruction, and generalizes this algorithm to multi-layer and multi-branch cases.
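The two steps can be illustrated with scikit-learn and NumPy; how the per-channel response matrix X and target y are collected from feature maps is assumed here, and the real method applies this layer by layer.

    import numpy as np
    from sklearn.linear_model import Lasso

    def lasso_channel_select(X: np.ndarray, y: np.ndarray, alpha: float) -> np.ndarray:
        """Step 1: LASSO-based channel selection. `X[s, c]` is channel c's
        contribution to the output at sample s, `y` the original response;
        channels whose coefficient is driven to zero are pruned."""
        beta = Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_
        return np.flatnonzero(beta)                    # indices of kept channels

    def least_squares_reconstruct(X_kept: np.ndarray, y: np.ndarray) -> np.ndarray:
        """Step 2: re-fit the kept channels by least squares so the pruned layer
        reconstructs the original output as closely as possible."""
        w, *_ = np.linalg.lstsq(X_kept, y, rcond=None)
        return w

    # Example with synthetic statistics: 200 samples, 16 candidate channels.
    X = np.random.randn(200, 16)
    y = X @ np.random.randn(16) + 0.01 * np.random.randn(200)
    kept = lasso_channel_select(X, y, alpha=0.1)
    w = least_squares_reconstruct(X[:, kept], y)
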
Accelerating Convolutional Networks via Global & Dynamic Filter Pruning
TLDR
This paper proposes a novel global & dynamic pruning (GDP) scheme to prune redundant filters for CNN acceleration, and achieves superior performance in accelerating several cutting-edge CNNs on the ILSVRC 2012 benchmark.
NISP: Pruning Networks Using Neuron Importance Score Propagation
  • Ruichi Yu, Ang Li, L. Davis
  • Computer Science
    2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
TLDR
The Neuron Importance Score Propagation (NISP) algorithm is proposed to propagate the importance scores of final responses to every neuron in the network and is evaluated on several datasets with multiple CNN models and demonstrated to achieve significant acceleration and compression with negligible accuracy loss.
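A simplified reading of the propagation rule for fully connected layers is that each neuron inherits the importance of the neurons it feeds, weighted by the absolute connection strengths; a small NumPy sketch under that assumption:

    import numpy as np

    def propagate_importance(weights, final_scores):
        """Propagate importance scores from the final responses back through the
        network. `weights[k]` maps layer k to layer k+1, shape (out_k, in_k)."""
        scores = [None] * len(weights) + [final_scores]
        for k in range(len(weights) - 1, -1, -1):
            scores[k] = np.abs(weights[k]).T @ scores[k + 1]   # (in_k,) <- (out_k,)
        return scores

    # Example: a 3-layer MLP with widths 8 -> 6 -> 4, final responses equally important.
    W = [np.random.randn(6, 8), np.random.randn(4, 6)]
    per_layer = propagate_importance(W, final_scores=np.ones(4))
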
...