Corpus ID: 211032279

A Deep Conditioning Treatment of Neural Networks

@inproceedings{Agarwal2021ADC,
  title={A Deep Conditioning Treatment of Neural Networks},
  author={Naman Agarwal and Pranjal Awasthi and S. Kale},
  booktitle={ALT},
  year={2021}
}
We study the role of depth in training randomly initialized overparameterized neural networks. We give a general result showing that depth improves trainability of neural networks by improving the conditioning of certain kernel matrices of the input data. This result holds for arbitrary non-linear activation functions under a certain normalization. We provide versions of the result that hold for training just the top layer of the neural network, as well as for training all layers, via the…
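The central claim, that depth improves the conditioning of certain kernel matrices of the data under a suitable normalization, can be illustrated numerically. The sketch below is not the paper's construction: the width, the tanh activation, the synthetic near-collinear inputs, and the row renormalization are all illustrative assumptions. It propagates badly conditioned inputs through a randomly initialized fully connected network and reports the condition number of the top-layer feature kernel as depth grows.

```python
# Minimal numerical sketch (illustrative assumptions throughout, not the
# paper's construction): track how the condition number of the top-layer
# feature kernel of a randomly initialized network changes with depth.
import numpy as np

rng = np.random.default_rng(0)

def normalize_rows(H):
    # Scale each row to squared norm H.shape[1], so the empirical kernel
    # H @ H.T / H.shape[1] has a unit diagonal (a stand-in for the paper's
    # activation normalization).
    return H * np.sqrt(H.shape[1]) / np.linalg.norm(H, axis=1, keepdims=True)

def feature_kernel(X, depth, width=2048):
    # Propagate X through `depth` random tanh layers with 1/sqrt(fan_in)
    # weight scaling; return the empirical kernel of the final hidden layer.
    H = normalize_rows(X)
    for _ in range(depth):
        W = rng.normal(size=(H.shape[1], width)) / np.sqrt(H.shape[1])
        H = normalize_rows(np.tanh(H @ W))
    return H @ H.T / H.shape[1]

# Nearly collinear inputs: the input Gram matrix is badly conditioned,
# mimicking hard-to-separate data.
n, d = 20, 64
base = rng.normal(size=d)
X = base + 0.05 * rng.normal(size=(n, d))

Xn = normalize_rows(X)
print(f"input Gram condition number ≈ {np.linalg.cond(Xn @ Xn.T / d):.2e}")
for depth in (1, 2, 4, 8, 16):
    K = feature_kernel(X, depth)
    print(f"depth {depth:2d}: kernel condition number ≈ {np.linalg.cond(K):.2e}")
```

In this toy setting, the layer-wise renormalization keeps the kernel's diagonal at one while the off-diagonal correlations contract toward zero as depth increases, so the kernel moves toward the identity and its condition number drops; this is the qualitative behavior the abstract attributes to depth.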
