Corpus ID: 53556443

Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

@article{Mostafa2019ParameterET,
  title={Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization},
  author={Hesham Mostafa and Xin Wang},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.05967}
}
Modern deep neural networks are typically highly overparameterized. [...] Key Result: Our work suggests that exploring structural degrees of freedom during training is more effective than adding extra parameters to the network.
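The dynamic sparse reparameterization proposed in the paper keeps a fixed parameter budget and repeatedly moves it around during training: low-magnitude weights are pruned and an equal number of connections are regrown elsewhere. Below is a minimal NumPy sketch of that prune-and-regrow loop only; the fixed per-layer prune fraction, the uniform regrowth split across layers, and the toy layer shapes are illustrative simplifications, not the authors' adaptive-threshold heuristic.

import numpy as np

rng = np.random.default_rng(0)

def reallocate(weights, masks, prune_fraction=0.2):
    # One prune-and-regrow step over a list of layer weight matrices:
    # drop the smallest-magnitude surviving weights in each layer, then regrow
    # the same total number of connections at random zero positions, keeping
    # the global parameter budget fixed.
    total_pruned = 0
    for w, m in zip(weights, masks):
        flat_w, flat_m = w.ravel(), m.ravel()
        alive = np.flatnonzero(flat_m)
        k = int(prune_fraction * alive.size)
        if k == 0:
            continue
        drop = alive[np.argsort(np.abs(flat_w[alive]))[:k]]
        flat_m[drop] = 0
        flat_w[drop] = 0.0
        total_pruned += k
    # Spread the freed budget uniformly across layers (a simplification of
    # the paper's reallocation heuristic) and regrow at random zero positions.
    per_layer = total_pruned // len(weights)
    for w, m in zip(weights, masks):
        flat_w, flat_m = w.ravel(), m.ravel()
        free = np.flatnonzero(flat_m == 0)
        if per_layer == 0 or free.size == 0:
            continue
        grow = rng.choice(free, size=min(per_layer, free.size), replace=False)
        flat_m[grow] = 1
        flat_w[grow] = 0.0  # regrown connections start at zero

# Toy usage: two sparse layers at roughly 90% sparsity.
weights = [rng.standard_normal((64, 64)), rng.standard_normal((64, 10))]
masks = [(rng.random(w.shape) < 0.1).astype(np.int8) for w in weights]
for w, m in zip(weights, masks):
    w *= m
reallocate(weights, masks)
print([int(m.sum()) for m in masks])  # surviving connections per layer

In the paper, this reallocation step is interleaved with ordinary training updates of the surviving weights.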
Citations

Dynamic parameter reallocation improves trainability of deep convolutional networks
TLDR: It is shown that neither the structure nor the initialization of the discovered high-performance subnetwork is sufficient to explain its good performance; it is the dynamics of parameter reallocation that are responsible for successful learning.
Dynamic Pruning of a Neural Network via Gradient Signal-to-Noise Ratio
While training highly overparameterized neural networks is common practice in deep learning, research into post-hoc weight pruning suggests that more than 90% of parameters can be removed without [...]
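The gradient signal-to-noise ratio named in the title is commonly defined as the squared mean of a parameter's gradient divided by its variance across samples; the sketch below illustrates only that scoring rule, as an assumption about the pruning criterion, with placeholder gradient samples and an arbitrary pruning quantile rather than the paper's actual procedure.

import numpy as np

def gsnr_scores(grad_samples):
    # Gradient signal-to-noise ratio per parameter: squared mean gradient
    # divided by gradient variance, estimated from per-sample gradients of
    # shape (num_samples, num_params). Low-GSNR parameters are pruning candidates.
    mean = grad_samples.mean(axis=0)
    var = grad_samples.var(axis=0) + 1e-12   # avoid division by zero
    return mean ** 2 / var

rng = np.random.default_rng(0)
grads = rng.standard_normal((64, 1000))      # stand-in per-sample gradients
scores = gsnr_scores(grads)
prune = scores < np.quantile(scores, 0.9)    # e.g. prune the 90% lowest-GSNR weights
print(int(prune.sum()), "of", prune.size, "weights selected for pruning")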
Efficient and effective training of sparse recurrent neural networks
TLDR: This paper introduces a method to train intrinsically sparse RNN models with a fixed number of parameters and floating-point operations (FLOPs) during training, and demonstrates state-of-the-art sparse performance with long short-term memory and recurrent highway networks on widely used tasks: language modeling and text classification.
FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity
TLDR: This work introduces the FreeTickets concept as the first solution that can boost the performance of sparse convolutional neural networks over their dense equivalents by a large margin, while using only a fraction of the computational resources required by the dense networks for complete training.
Continuous Pruning of Deep Convolutional Networks Using Selective Weight Decay
TLDR: This work introduces a new technique, Selective Weight Decay (SWD), which achieves continuous pruning throughout training and compares favorably to other approaches in terms of the performance/parameters ratio on the CIFAR-10 and ImageNet ILSVRC2012 datasets.
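As a rough illustration of the selective-weight-decay idea, the sketch below adds an extra decay term only to the weights that a magnitude criterion would currently prune, so they are driven toward zero continuously during training. The target sparsity, penalty strength, and constant (unramped) schedule are placeholder assumptions, not the method's actual configuration.

import numpy as np

def swd_penalty_grad(w, target_sparsity=0.9, penalty=1e-2):
    # Extra gradient term that decays only the weights currently selected
    # for pruning (here: the smallest magnitudes); all other weights are untouched.
    k = int(target_sparsity * w.size)
    if k == 0:
        return np.zeros_like(w)
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    selected = np.abs(w) <= threshold        # weights the criterion would prune
    return penalty * w * selected            # decay pushes them toward zero

# Inside a training loop this term is added to the ordinary task gradient:
w = np.random.default_rng(0).standard_normal((32, 32))
task_grad = np.zeros_like(w)                 # stand-in for the real gradient
lr = 0.1
w -= lr * (task_grad + swd_penalty_grad(w))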
Improving Neural Network With Uniform Sparse Connectivity
  • Weijun Luo
  • Computer Science, Mathematics
  • IEEE Access
  • 2020
TLDR: The novel uniform sparse network (USN), with even and sparse connectivity within each layer, is proposed; it is conceptually simple, a natural generalization of the fully connected network, and brings multiple improvements in accuracy, robustness, and scalability.
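One plausible reading of "even and sparse connectivity within each layer" is a layer mask in which every output unit receives the same number of randomly chosen incoming connections; the sketch below builds such a mask. The fan-in value and layer sizes are arbitrary, and this is an assumption for illustration rather than the paper's exact construction.

import numpy as np

def uniform_sparse_mask(n_in, n_out, fan_in, rng):
    # Layer mask in which every output unit gets exactly `fan_in`
    # randomly chosen incoming connections (uniform sparse connectivity).
    mask = np.zeros((n_in, n_out), dtype=np.float32)
    for j in range(n_out):
        mask[rng.choice(n_in, size=fan_in, replace=False), j] = 1.0
    return mask

rng = np.random.default_rng(0)
mask = uniform_sparse_mask(n_in=256, n_out=64, fan_in=16, rng=rng)
print(mask.sum(axis=0))   # every column has exactly 16 incoming connections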
Effective Sparsification of Neural Networks with Global Sparsity Constraint
TLDR: ProbMask is proposed, which solves a natural sparsification formulation under a global sparsity constraint and can outperform previous state-of-the-art methods by a significant margin, especially at high pruning rates.
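To illustrate only the global-constraint aspect, the sketch below allocates a single network-wide parameter budget by ranking all weights of all layers together (here simply by magnitude), so layers compete for the budget instead of receiving fixed per-layer quotas. ProbMask itself learns probabilistic masks; this toy version does not attempt that.

import numpy as np

def global_topk_masks(weights, keep_ratio=0.1):
    # Rank every weight in the network by a score (here, magnitude) and
    # keep only the global top-k, so the sparsity budget is shared across layers.
    scores = np.concatenate([np.abs(w).ravel() for w in weights])
    k = max(1, int(keep_ratio * scores.size))
    threshold = np.sort(scores)[-k]          # ties may keep slightly more than k
    return [(np.abs(w) >= threshold).astype(np.float32) for w in weights]

rng = np.random.default_rng(0)
weights = [rng.standard_normal((128, 64)), rng.standard_normal((64, 10))]
masks = global_topk_masks(weights, keep_ratio=0.05)
print([float(m.mean()) for m in masks])      # per-layer densities chosen globally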
Truly Sparse Neural Networks at Scale
TLDR: This paper introduces three novel contributions specially designed for sparse neural networks and is able to break the record by training the largest neural network ever trained in terms of representational power, reaching the size of the bat brain.
Effective Model Sparsification by Scheduled Grow-and-Prune Methods
TLDR: A novel scheduled grow-and-prune (GaP) methodology is proposed that does not require pre-training a dense model; it addresses the shortcomings of previous work by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
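A rough sketch of that schedule, under assumed details: cycle through groups of layers, growing each group to dense for one training phase and pruning it back to the target sparsity before the next group is grown. The group size, cycling order, and number of cycles are illustrative choices, not the paper's settings.

def gap_schedule(num_layers, num_cycles, group_size=2):
    # Yield (phase, layers_to_grow) pairs for a scheduled grow-and-prune run:
    # each phase grows one group of layers to dense; at the end of the phase
    # that group is magnitude-pruned back to sparse before the next group grows.
    groups = [list(range(i, min(i + group_size, num_layers)))
              for i in range(0, num_layers, group_size)]
    phase = 0
    for _ in range(num_cycles):
        for group in groups:
            yield phase, group
            phase += 1

for phase, dense_layers in gap_schedule(num_layers=6, num_cycles=1):
    print(f"phase {phase}: grow layers {dense_layers} to dense, then prune back")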
Dynamic Model Pruning with Feedback
TLDR: A novel model compression method is proposed that generates a sparse trained model without additional overhead by allowing dynamic allocation of the sparsity pattern and incorporating a feedback signal to reactivate prematurely pruned weights, yielding a performant sparse model in a single training pass.
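The reactivation idea can be sketched generically as follows: a dense copy of the weights is kept throughout training, the mask is recomputed from that copy at every step so previously pruned weights can re-enter, and the gradient computed on the masked model is applied back to the dense copy. The top-k mask, learning rate, and stand-in gradient below are placeholder assumptions, not the paper's exact formulation.

import numpy as np

rng = np.random.default_rng(0)
w_dense = rng.standard_normal(1000)          # dense copy kept for the whole run
keep = 100                                   # sparsity budget (top-k weights)

for step in range(5):
    # Recompute the mask from the dense copy so prematurely pruned weights
    # can be reactivated once their magnitude recovers.
    idx = np.argpartition(np.abs(w_dense), -keep)[-keep:]
    mask = np.zeros_like(w_dense)
    mask[idx] = 1.0
    w_sparse = w_dense * mask                # the model that is actually evaluated

    grad = np.sign(w_sparse)                 # stand-in for the gradient w.r.t. w_sparse
    w_dense -= 0.01 * grad                   # feedback: the update hits the dense copy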

References

Showing a selection of the 96 references.
Data-Driven Sparse Structure Selection for Deep Neural Networks
Deep convolutional neural networks have demonstrated extraordinary power on various tasks. However, it is still very challenging to deploy state-of-the-art models in real-world applications due to [...]
Wide Residual Networks
Deep residual networks were shown to scale up to thousands of layers and still improve in performance. However, each fraction of a percent of improved accuracy costs nearly doubling [...]
Learning Efficient Convolutional Networks through Network Slimming
The deployment of deep convolutional neural networks (CNNs) in many real-world applications is largely hindered by their high computational cost. In this paper, we propose a novel learning scheme for [...]
Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a [...]
Learning both Weights and Connections for Efficient Neural Network
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; [...]
Learning Structured Sparsity in Deep Neural Networks
High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNN) in resource-constrained devices. In this work, we propose a Structured Sparsity Learning [...]
Low-Cost Parameterizations of Deep Convolution Neural Networks
Convolutional Neural Networks (CNNs) filter the input data using a series of spatial convolution operators with compactly supported stencils and point-wise nonlinearities. Commonly, the convolution [...]