Predicting Parameters in Deep Learning

@inproceedings{Denil2013PredictingPI,
  title={Predicting Parameters in Deep Learning},
  author={Misha Denil and Babak Shakibi and Laurent Dinh and Marc'Aurelio Ranzato and Nando de Freitas},
  booktitle={NIPS},
  year={2013}
}
We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of…
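The idea of predicting a feature's remaining weights from a few known values can be illustrated with a toy sketch. The example below assumes a single feature whose weights vary smoothly over a 1-D index, and uses kernel ridge regression with a squared-exponential kernel as the predictor. The function names, kernel choice, length scale, ridge term, and synthetic feature are all illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1-D coordinate arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def predict_weights(coords, observed_idx, observed_vals,
                    length_scale=4.0, ridge=1e-3):
    """Predict a weight at every coordinate from a few observed weights.

    Hypothetical helper: plain kernel ridge regression, used here only
    to illustrate exploiting smoothness in a feature's weights.
    """
    x_obs = coords[observed_idx]
    k_obs = sq_exp_kernel(x_obs, x_obs, length_scale)
    k_all = sq_exp_kernel(coords, x_obs, length_scale)
    alpha = np.linalg.solve(k_obs + ridge * np.eye(len(x_obs)), observed_vals)
    return k_all @ alpha


# Synthetic smooth "feature" with 64 weights (an assumption for the demo).
coords = np.arange(64, dtype=float)
true_w = np.sin(coords / 8.0) * np.exp(-((coords - 32.0) / 20.0) ** 2)

# Keep every fourth weight (16 of 64) and predict the rest.
observed_idx = np.arange(0, 64, 4)
pred_w = predict_weights(coords, observed_idx, true_w[observed_idx])

rel_err = np.linalg.norm(pred_w - true_w) / np.linalg.norm(true_w)
print(f"observed {len(observed_idx)} of {len(coords)} weights, "
      f"relative error {rel_err:.4f}")
```

The prediction only works this well because the synthetic feature is smooth. Weights without such structure would not be recoverable from a small subset by this kind of interpolation.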
845 Citations
Learning the Architecture of Deep Neural Networks
  • 10
Learning Neural Network Architectures using Backpropagation
  • 21
In Teacher We Trust: Learning Compressed Models for Pedestrian Detection
  • 28
Data-free Parameter Pruning for Deep Neural Networks
  • 310
Learning the Number of Neurons in Deep Networks
  • 260
Fast learning in Deep Neural Networks
  • 29
Log-sum enhanced sparse deep neural network
Tensorizing Neural Networks
  • 472
In Teacher We Trust: Deep Network Compression for Pedestrian Detection
Deep Mixture of Experts via Shallow Embedding
  • 22