Corpus ID: 6844431

Dropout: a simple way to prevent neural networks from overfitting

@article{srivastava2014dropout,
  title={Dropout: a simple way to prevent neural networks from overfitting},
  author={Nitish Srivastava and Geoffrey E. Hinton and A. Krizhevsky and Ilya Sutskever and R. Salakhutdinov},
  journal={J. Mach. Learn. Res.},
  year={2014}
}
  • Nitish Srivastava, Geoffrey E. Hinton, A. Krizhevsky, Ilya Sutskever, R. Salakhutdinov
  • Published 2014
  • Computer Science
  • J. Mach. Learn. Res.
  • Deep neural nets with a large number of parameters are very powerful machine learning systems. [...] During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves…
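The train/test behaviour described in the abstract can be illustrated with a minimal sketch for a single linear unit. It assumes that each input is retained independently with probability `p_retain` and that "smaller weights" means multiplying each weight by `p_retain` at test time; the function names, the single-unit setup, and the parameter names are hypothetical and chosen only for illustration, not taken from the authors' code.

```python
import random


def sample_thinned_output(inputs, weights, p_retain, rng):
    """One training-style forward pass through a single linear unit.

    Each input is kept with probability p_retain and dropped (zeroed)
    otherwise, which corresponds to sampling one "thinned" network.
    """
    total = 0.0
    for x, w in zip(inputs, weights):
        if rng.random() < p_retain:  # keep this input unit
            total += x * w
    return total


def unthinned_output(inputs, weights, p_retain):
    """Test-style forward pass: all units present, weights scaled by p_retain.

    Scaling by p_retain is the assumed reading of "smaller weights".
    """
    return sum(x * (w * p_retain) for x, w in zip(inputs, weights))


def average_over_thinned(inputs, weights, p_retain, n_samples, seed=0):
    """Monte Carlo average of the outputs of n_samples sampled thinned networks."""
    rng = random.Random(seed)
    total = sum(
        sample_thinned_output(inputs, weights, p_retain, rng)
        for _ in range(n_samples)
    )
    return total / n_samples
```

For one linear unit, the scaled-weight output equals the expected output over all thinned networks, so `average_over_thinned` should approach `unthinned_output` as `n_samples` grows. In a deep network with nonlinearities the scaled-weight network is only an approximation of that average, which matches the abstract's wording.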
    19,988 Citations
    The Effect Of Hyperparameters In The Activation Layers Of Deep Neural Networks
    Curriculum Dropout
    • 30
    Learning Compact Architectures for Deep Neural Networks
    Modern Neural Networks Generalize on Small Data Sets
    • 28
    Controlled dropout: A different dropout for improving training speed on deep neural network
    • 7
    A Survey of Sparse-Learning Methods for Deep Neural Networks
    • Rongrong Ma, L. Niu
    • Computer Science
    • 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
    • 2018
    • 1


    Improving Neural Networks with Dropout
    • 193
    Learning Multiple Layers of Features from Tiny Images
    • 9,321
    Fast dropout training
    • 312
    Bayesian learning for neural networks
    • 3,197
    A Fast Learning Algorithm for Deep Belief Nets
    • 11,225
    Learning with Marginalized Corrupted Features
    • 134
    Simplifying Neural Networks by Soft Weight-Sharing
    • 568
    Dropout Training as Adaptive Regularization
    • 397
    Deep Boltzmann Machines
    • 1,747
    Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
    • 4,515