Corpus ID: 6666117

Data-Dependent Path Normalization in Neural Networks

@article{Neyshabur2015DataDependentPN,
  title={Data-Dependent Path Normalization in Neural Networks},
  author={Behnam Neyshabur and Ryota Tomioka and Ruslan Salakhutdinov and Nathan Srebro},
  journal={CoRR},
  year={2015},
  volume={abs/1511.06747}
}
We propose a unified framework for neural network normalization, regularization and optimization, which includes Path-SGD and Batch Normalization and interpolates between them across two different dimensions. Through this framework we investigate the issue of invariance of the optimization, data dependence, and the connection with natural gradients.
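As background for the framework described in the abstract, Path-SGD rescales updates by a path-based norm of the weights. The sketch below (illustrative only, not the paper's implementation) computes the squared path norm of a two-layer ReLU network: the sum, over all input-to-output paths, of the product of squared weights along the path. For a single hidden layer this factorizes into a per-hidden-unit product of incoming and outgoing squared-weight sums.

```python
def path_norm_squared(w1, w2):
    """Squared path norm of a 2-layer network.

    w1: input-to-hidden weights as a list of rows (one row per input),
    w2: hidden-to-output weights as a list of rows (one row per hidden unit).
    Sum over all input->hidden->output paths of the product of the
    squared weights on the path, computed via the per-hidden-unit
    factorization (incoming squared mass) * (outgoing squared mass).
    """
    total = 0.0
    for j in range(len(w2)):
        incoming = sum(row[j] ** 2 for row in w1)   # into hidden unit j
        outgoing = sum(w ** 2 for w in w2[j])       # out of hidden unit j
        total += incoming * outgoing
    return total

# Toy example: 2 inputs -> 2 hidden units -> 1 output
w1 = [[1.0, 2.0], [0.0, 1.0]]
w2 = [[1.0], [2.0]]
print(path_norm_squared(w1, w2))  # 21.0
```

Because the path norm is invariant to the node-wise rescalings that leave a ReLU network's function unchanged, optimizing with respect to it (as Path-SGD does) yields rescaling-invariant updates, which is the invariance property the abstract refers to.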
19 Citations

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
Exploring Generalization in Deep Learning
Scaling-Based Weight Normalization for Deep Neural Networks
Visualizing Deep Network Training Trajectories with PCA
Centered Weight Normalization in Accelerating Training of Deep Neural Networks
The Implicit Biases of Stochastic Gradient Descent on Deep Neural Networks with Batch Normalization
Implicit Regularization in Deep Learning
Fast ConvNets Using Group-Wise Brain Damage (V. Lebedev, V. Lempitsky; 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR))
