DEEP MEAN FIELD THEORY: LAYERWISE VARIANCE

@inproceedings{2018DEEPMF,
  title={DEEP MEAN FIELD THEORY: LAYERWISE VARIANCE},
  author={},
  year={2018}
}
  • Published 2018
A recent line of work has studied the statistical properties of neural networks from a mean field theory perspective to great success, making and verifying precise predictions of neural network behavior and test-time performance. In this paper, we build upon these works to explore two methods for taming the behavior of random residual networks (with only fully connected layers and no batchnorm). The first method is width variation (WV), i.e. varying the widths of layers as a function of…
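Although the abstract is truncated here, the setup it describes — a randomly initialized residual network built from fully connected layers without batchnorm, whose layer widths vary with depth — can be illustrated with a short simulation. The sketch below is an assumption-laden toy, not the paper's exact construction: the width_schedule function, the ReLU nonlinearity, the random skip-path projection, and the weight scale sigma_w are all illustrative choices. It simply records how the layerwise activation variance evolves through such a network at initialization.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# a random fully connected residual network without batchnorm, with layer
# widths varying as a function of depth ("width variation"), tracking the
# layerwise variance of activations.
import numpy as np

rng = np.random.default_rng(0)

def width_schedule(layer, base_width=128, rate=1.05):
    # Hypothetical schedule: widths grow geometrically with depth.
    return int(base_width * rate ** layer)

def run_random_resnet(depth=40, sigma_w=1.0, n_inputs=64):
    widths = [width_schedule(l) for l in range(depth + 1)]
    # Random input batch with unit variance per coordinate.
    x = rng.normal(size=(n_inputs, widths[0]))
    variances = []
    for l in range(depth):
        fan_in, fan_out = widths[l], widths[l + 1]
        # i.i.d. Gaussian weights with variance sigma_w^2 / fan_in
        # (the usual mean-field scaling).
        W = rng.normal(scale=sigma_w / np.sqrt(fan_in), size=(fan_in, fan_out))
        # Because consecutive widths differ, the skip path is projected with a
        # fixed random matrix -- an assumption made for this sketch only.
        P = rng.normal(scale=1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))
        # Residual block: skip path plus a ReLU branch.
        x = x @ P + np.maximum(x, 0.0) @ W
        variances.append(x.var())
    return variances

for l, v in enumerate(run_random_resnet(), start=1):
    if l % 10 == 0:
        print(f"layer {l:3d}: activation variance {v:.3e}")
```

Quantities like this layerwise variance, as a function of depth and of the width schedule, are what a mean field analysis of such random networks aims to predict.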

