Corpus ID: 236447887

Experiments on Properties of Hidden Structures of Sparse Neural Networks

Julian Stier, Harsh Darji, Michael Granitzer
Sparsity in the structure of neural networks can lead to less energy consumption, less memory usage, faster computation times on suitable hardware, and automated machine learning. If sparsity gives rise to certain kinds of structure, it can explain automatically obtained features during learning. We provide insights into experiments in which we show how sparsity can be achieved through prior initialization, pruning, and during learning, and answer questions on the relationship between the…
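One of the routes to sparsity mentioned above, prior initialization, can be sketched minimally: a binary mask is sampled once as a structural prior and applied to a dense weight matrix, so the masked connections never participate in training. This is a hypothetical NumPy illustration, not the paper's implementation; the function name and the 10% density are assumptions for the example.

```python
import numpy as np

def sparse_prior_mask(shape, density, rng):
    """Sample a fixed binary mask used as a random structural prior."""
    return (rng.random(shape) < density).astype(np.float32)

rng = np.random.default_rng(42)
W = rng.normal(size=(128, 64)).astype(np.float32)  # dense initialization
mask = sparse_prior_mask(W.shape, density=0.1, rng=rng)
W = W * mask  # masked connections stay at zero throughout training
print("density:", np.count_nonzero(W) / W.size)
```

In practice the same mask would be re-applied after every gradient update (or the gradients themselves masked), so the structure fixed at initialization is preserved during learning.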


Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science
A method to design neural networks as sparse scale-free networks, which reduces the computational time required for training and inference and has the potential to enable artificial neural networks to scale up beyond what is currently possible.
Pruning recurrent neural networks for improved generalization performance
This work presents a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks and shows that rules extracted from networks trained with this heuristic are more consistent with the rules to be learned.
Learning both Weights and Connections for Efficient Neural Network
A method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy: only the important connections are learned, and redundant connections are pruned using a three-step method.
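The pruning step in a three-step scheme like the one summarized above is typically magnitude-based: after an initial training phase, the smallest-magnitude weights are zeroed, and the survivors are retrained. A minimal NumPy sketch of that middle step, with the function name and the 90% sparsity level chosen for illustration:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights; keep the rest unchanged."""
    k = int(sparsity * weights.size)  # number of weights to remove
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(100, 10))
W_pruned, mask = magnitude_prune(W, sparsity=0.9)
# A retraining phase would then update only the surviving weights under `mask`.
print("surviving weights:", np.count_nonzero(W_pruned))
```

The returned mask is what makes the third step possible: gradients for pruned positions are discarded so the connectivity stays fixed during retraining.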
Exploring Sparsity in Recurrent Neural Networks
This work proposes a technique to reduce the parameters of a network by pruning weights during the initial training of the network, which shrinks the model and can also yield significant inference-time speed-ups through sparse matrix multiplication.
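Pruning during training, as described above, is usually driven by a sparsity schedule that ramps up gradually so the network can adapt. The cubic ramp below is one commonly used schedule, shown here purely as an illustration; the function name and the step boundaries are assumptions, not this paper's exact schedule.

```python
def target_sparsity(step, start, end, final):
    """Cubic ramp: target sparsity grows from 0 to `final` over [start, end]."""
    if step <= start:
        return 0.0
    if step >= end:
        return final
    progress = (step - start) / (end - start)
    return final * (1.0 - (1.0 - progress) ** 3)

# At each training step, weights below a threshold chosen to hit the
# current target sparsity are zeroed; the target rises until `end`.
for s in (0, 2000, 4000, 6000):
    print(s, target_sparsity(s, start=1000, end=5000, final=0.9))
```

The ramp prunes aggressively early, when many weights are redundant, and slowly near the end, when removals are more disruptive.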
An adaptive growing and pruning algorithm for designing recurrent neural network
Experimental results show that the proposed RSONN effectively simplifies the network structure and performs better than some existing methods.
Fault tolerance of pruned multilayer networks
  • B. Segee, M.J. Carter
  • Computer Science
  • IJCNN-91-Seattle International Joint Conference on Neural Networks
  • 1991
An assessment of the tolerance of multilayer feedforward networks to the zeroing of individual weights finds that the unpruned networks, which had considerably more free parameters, were no more tolerant to weight zeroing than the pruned networks.
Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
This work suggests that exploring structural degrees of freedom during training is more effective than adding extra parameters to the network; the proposed method outperforms previous static and dynamic reparameterization methods, yielding the best accuracy for a fixed parameter budget.
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
A new recurrent pruning objective derived from the spectrum of the recurrent Jacobian is introduced, which is data efficient, easy to implement, and produces 95% sparse GRUs that significantly improve on existing baselines.
The State of Sparsity in Deep Neural Networks
It is shown that unstructured sparse architectures learned through pruning cannot be trained from scratch to the same test set performance as a model trained with joint sparsification and optimization, and the need for large-scale benchmarks in the field of model compression is highlighted.
Correlation Analysis between the Robustness of Sparse Neural Networks and their Random Hidden Structural Priors
An empirical study with neural network models obtained through random graphs used as sparse structural priors for the networks found that robustness measures are independent of initialization methods but show weak correlations with graph properties: higher graph densities correlate with lower robustness, and higher average path lengths and average node eccentricities also show negative correlations with robustness measures.