Corpus ID: 211990606

Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection

@inproceedings{Ye2020GoodSP,
  title={Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection},
  author={Mao Ye and Chengyue Gong and Lizhen Nie and Denny Zhou and Adam R. Klivans and Qiang Liu},
  booktitle={ICML},
  year={2020}
}
Recent empirical works show that large deep neural networks are often highly redundant and one can find much smaller subnetworks without a significant drop in accuracy. However, most existing methods of network pruning are empirical and heuristic, leaving it open whether good subnetworks provably exist, how to find them efficiently, and whether network pruning can be provably better than direct training using gradient descent. We answer these questions positively by proposing a simple greedy…
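To make the greedy forward selection idea concrete: rather than starting from the full network and deleting neurons, the subnetwork is grown from scratch by repeatedly adding whichever candidate neuron lowers the loss the most. The sketch below is not the authors' implementation; it is a hypothetical single-layer, least-squares toy in which `candidate_outputs`, `greedy_forward_select`, and the mean-squared reconstruction loss are all illustrative assumptions.

```python
import numpy as np

def greedy_forward_select(candidate_outputs, target, max_size):
    """Greedily grow a subnetwork: at each step, add the candidate neuron
    whose inclusion (with refitted output coefficients) gives the lowest
    mean-squared error against the full network's output."""
    n_neurons = candidate_outputs.shape[0]
    selected, coef = [], None
    for _ in range(max_size):
        best = (None, np.inf, None)                 # (index, loss, coefficients)
        for i in range(n_neurons):
            if i in selected:
                continue
            trial = selected + [i]
            A = candidate_outputs[trial].T          # (n_samples, |trial|)
            c, *_ = np.linalg.lstsq(A, target, rcond=None)
            loss = np.mean((A @ c - target) ** 2)
            if loss < best[1]:
                best = (i, loss, c)
        selected.append(best[0])
        coef = best[2]
    return selected, coef

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    outputs = rng.standard_normal((64, 500))        # 64 candidate neurons, 500 samples
    full_out = outputs.T @ rng.standard_normal(64)  # stand-in for the full network's output
    keep, coef = greedy_forward_select(outputs, full_out, max_size=8)
    print("kept neurons:", keep)
```

In the paper's setting the candidates come from a trained large network and the objective is the subnetwork's training loss; the least-squares refit above simply keeps the toy self-contained.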
Provably Efficient Lottery Ticket Discovery
TLDR: Experiments demonstrate the validity of the theoretical results across a variety of architectures and datasets, including multi-layer perceptrons trained on MNIST and several deep convolutional neural network (CNN) architectures trained on CIFAR10 and ImageNet.
Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough
TLDR: A greedy optimization based pruning method with the guarantee that the discrepancy between the pruned network and the original network decays at an exponentially fast rate w.r.t. the size of the pruned network, under weak assumptions that apply to most practical settings.
Effective Sparsification of Neural Networks with Global Sparsity Constraint
TLDR: ProbMask is proposed, which solves a natural sparsification formulation under a global sparsity constraint and can outperform previous state-of-the-art methods by a significant margin, especially at high pruning rates.
DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks
TLDR: A novel dynamic convolution method that adaptively generates convolution kernels based on image content, reducing computation cost remarkably while maintaining performance; experiments also verify its scalability.
A Gradient Flow Framework For Analyzing Network Pruning
TLDR: A general gradient-flow-based framework is developed that unifies state-of-the-art importance measures through the norm of model parameters and establishes several results on pruning models early in training, including that magnitude-based pruning preserves first-order model evolution dynamics and is appropriate for pruning minimally trained models.
A Probabilistic Approach to Neural Network Pruning
TLDR: This work theoretically studies the performance of two pruning techniques (random and magnitude-based) on FCNs and CNNs, and establishes that there exist pruned networks with expressive power within any specified bound from the target network.
Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
TLDR: This framework formulates pruning as an optimization problem and proposes two approaches to lower its cost: specializing the polynomial to ensure an accurate regression even with less training data, and employing iterative pruning and fine-tuning to collect the data faster.
AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance
  • Xiangcheng Liu, Jian Cao, Hongyi Yao, Wenyu Sun, Yuan Zhang · ArXiv · 2021
TLDR: A pruning framework that adaptively determines the number of channels in each layer as well as the weights inheritance criteria for the sub-network; AdaPruner obtains the pruned network quickly, accurately and efficiently, taking into account both the structure and the initialization weights.
Adaptive Dynamic Pruning for Non-IID Federated Learning
TLDR: An adaptive pruning scheme for edge devices in a federated learning (FL) system is presented, which applies dataset-aware dynamic pruning for inference acceleration on non-IID datasets and accelerates inference by 2× (a 50% FLOPs reduction) while maintaining model quality on edge devices.
BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening
TLDR: A probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which can stochastically discard unimportant channels by modeling the probability of a channel being activated, achieving better accuracy given limited computational budgets.

References

Showing 1-10 of 50 references
A mean field view of the landscape of two-layer neural networks
TLDR: A compact description of the SGD dynamics is derived in terms of a limiting partial differential equation that allows for “averaging out” some of the complexities of the landscape of neural networks and can be used to prove a general convergence result for noisy SGD.
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
TLDR: ProxylessNAS is presented, which can directly learn the architectures for large-scale target tasks and target hardware platforms; it is applied to specialize neural architectures for hardware using direct hardware metrics (e.g. latency) and provides insights for efficient CNN architecture design.
MobileNetV2: Inverted Residuals and Linear Bottlenecks
TLDR: A new mobile architecture, MobileNetV2, is described that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes, and allows decoupling of the input/output domains from the expressiveness of the transformation.
ImageNet: A large-scale hierarchical image database
TLDR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Searching for MobileNetV3
TLDR: This paper starts the exploration of how automated search algorithms and network design can work together to harness complementary approaches improving the overall state of the art of MobileNets.
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
TLDR: This work finds that dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations, and articulates the "lottery ticket hypothesis"; a minimal sketch of the associated train-prune-rewind procedure appears after this reference list.
Learning Efficient Convolutional Networks through Network Slimming
TLDR: The approach is called network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy.
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
TLDR: ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both training and inference stages; it reveals that filters need to be pruned based on statistics computed from the next layer rather than the current layer, which differentiates ThiNet from existing methods.
Deep Residual Learning for Image Recognition
TLDR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning
  • Zechun Liu, Haoyuan Mu, +4 authors Jian Sun · 2019 IEEE/CVF International Conference on Computer Vision (ICCV) · 2019
TLDR: A novel meta learning approach for automatic channel pruning of very deep neural networks by training a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure given the target network.
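The lottery ticket entry above describes a finding rather than a full algorithm, but the procedure behind it (iterative magnitude pruning with rewinding to the original initialization) is easy to outline. The code below is an assumption-laden sketch, not the reference implementation: `build_model` and `train` are hypothetical callables supplied by the caller, and the per-tensor quantile threshold is only one of several reasonable pruning criteria.

```python
import copy
import torch

def iterative_magnitude_prune(build_model, train, rounds=3, prune_frac=0.2):
    """Sketch of the lottery-ticket procedure: save the random initialization,
    train, prune the smallest-magnitude surviving weights, rewind the survivors
    to their initial values, and repeat."""
    model = build_model()
    init_state = copy.deepcopy(model.state_dict())              # theta_0
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
             if p.dim() > 1}                                     # prune weight matrices only
    for _ in range(rounds):
        train(model, masks)                                      # caller keeps masked weights at zero
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                alive = p.abs()[masks[name].bool()]
                if alive.numel() == 0:
                    continue
                thresh = torch.quantile(alive, prune_frac)       # drop smallest prune_frac of survivors
                masks[name] *= (p.abs() > thresh).float()
            model.load_state_dict(init_state)                    # rewind to theta_0
            for name, p in model.named_parameters():
                if name in masks:
                    p *= masks[name]                             # zero out pruned weights
    return model, masks
```

Keeping pruned weights at zero during training (for example by re-applying the mask after each optimizer step) is left to the assumed `train` callable here; that detail is what makes the surviving subnetwork train "in isolation".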