Corpus ID: 239616317

SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning

Manuel Nonnenmacher, Thomas Pfeil, Ingo Steinwart, David Reeb
Pruning neural networks reduces inference time and memory costs. On standard hardware, these benefits will be especially prominent if coarse-grained structures, like feature maps, are pruned. We devise two novel saliency-based methods for second-order structured pruning (SOSP) which include correlations among all structures and layers. Our main method SOSP-H employs an innovative second-order approximation, which enables saliency evaluations by fast Hessian-vector products. SOSP-H thereby… 
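As a rough illustration of how a saliency score can be evaluated through Hessian-vector products without ever forming the Hessian, the sketch below uses a generic second-order Taylor estimate of the loss change caused by zeroing a group of weights. It is not the SOSP-H approximation itself, which the abstract does not spell out. The toy quadratic loss, the matrix `A`, the vector `B`, and the function names `hvp` and `second_order_saliency` are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Set

Vector = List[float]

# Hypothetical toy loss L(w) = 0.5 * w^T A w - B^T w (assumption for illustration).
# Its gradient is A w - B and its Hessian is A.
A = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.25],
     [0.0, 0.25, 3.0]]
B = [1.0, -1.0, 0.5]


def loss(w: Sequence[float]) -> float:
    n = len(w)
    quad = sum(w[i] * A[i][j] * w[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(B[i] * w[i] for i in range(n))


def grad(w: Sequence[float]) -> Vector:
    n = len(w)
    return [sum(A[i][j] * w[j] for j in range(n)) - B[i] for i in range(n)]


def hvp(grad_fn: Callable[[Sequence[float]], Vector],
        w: Sequence[float], v: Sequence[float], eps: float = 1e-4) -> Vector:
    """Approximate H @ v with a central difference of gradients.

    Only two gradient evaluations are needed; the Hessian is never built.
    """
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus, g_minus = grad_fn(w_plus), grad_fn(w_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def second_order_saliency(grad_fn: Callable[[Sequence[float]], Vector],
                          w: Sequence[float], structure: Set[int]) -> float:
    """Taylor estimate g^T d + 0.5 * d^T H d of the loss change when the
    weights whose indices are in `structure` are set to zero (d = -w there)."""
    delta = [-wi if i in structure else 0.0 for i, wi in enumerate(w)]
    g = grad_fn(w)
    h_delta = hvp(grad_fn, w, delta)
    first_order = sum(gi * di for gi, di in zip(g, delta))
    second_order = 0.5 * sum(di * hi for di, hi in zip(delta, h_delta))
    return first_order + second_order


if __name__ == "__main__":
    weights = [1.0, 2.0, -1.0]
    # Treat each single weight as its own "structure" and rank by estimated loss change.
    scores = {i: second_order_saliency(grad, weights, {i}) for i in range(len(weights))}
    for index, score in sorted(scores.items(), key=lambda item: item[1]):
        print(f"structure {index}: estimated loss change {score:.4f}")
```

In a real network the gradient would come from automatic differentiation, and each structure would be a whole feature map rather than a single weight; the finite-difference step is used here only to keep the example dependency-free.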


SNIP: Single-shot Network Pruning based on Connection Sensitivity
This work presents a new approach that prunes a given network once at initialization prior to training, and introduces a saliency criterion based on connection sensitivity that identifies structurally important connections in the network for the given task.
HRank: Filter Pruning Using High-Rank Feature Map
This paper proposes a novel filter pruning method by exploring the High Rank of feature maps (HRank), inspired by the discovery that the average rank of multiple feature maps generated by a single filter is always the same, regardless of the number of image batches CNNs receive.
Importance Estimation for Neural Network Pruning
A novel method is described that estimates the contribution of a neuron (filter) to the final loss and iteratively removes those with smaller scores, along with two variations that use first- and second-order Taylor expansions to approximate a filter's contribution.
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration
Unlike previous methods, FPGM compresses CNN models by pruning filters with redundancy, rather than those with “relatively less” importance, and when applied to two image classification benchmarks, the method validates its usefulness and strengths.
Towards Efficient Model Compression via Learned Global Ranking
A global ranking of the filters across different layers of the ConvNet is proposed, which is used to obtain a set of ConvNet architectures that have different accuracy/latency trade-offs by pruning the bottom-ranked filters.
Group Fisher Pruning for Practical Network Compression
A general channel pruning approach that can be applied to various complicated structures: a layer grouping algorithm finds coupled channels automatically, and a unified metric based on Fisher information evaluates the importance of single channels and coupled channels.
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
Shaohui Lin, R. Ji, +5 authors, D. Doermann · Computer Science · 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · 2019
This paper proposes an effective structured pruning approach that jointly prunes filters and other structures end to end, and solves the resulting optimization problem by generative adversarial learning (GAL), which learns a sparse soft mask in a label-free, end-to-end manner.
Faster gaze prediction with dense networks and Fisher pruning
Through a combination of knowledge distillation and Fisher pruning, this paper obtains much more runtime-efficient architectures for saliency prediction, achieving a 10x speedup for the same AUC performance as a state-of-the-art network on the CAT2000 dataset.
EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
A pruning method called EagleEye is presented, in which a simple yet efficient evaluation component based on adaptive batch normalization is applied to unveil a strong correlation between different pruned DNN structures and their final settled accuracy.
Variational Convolutional Neural Network Pruning
A variational technique is introduced to estimate the distribution of a newly proposed parameter, called channel saliency, based on which redundant channels can be removed from the model via a simple criterion, resulting in significant size reduction and computation savings.