Norm-Preservation: Why Residual Networks Can Become Extremely Deep?

  • Alireza Zaeemzadeh, N. Rahnavard, M. Shah
  • Published 2020
  • Computer Science, Medicine
  • IEEE transactions on pattern analysis and machine intelligence
  • Augmenting neural networks with skip connections, as introduced in the so-called ResNet architecture, surprised the community by enabling the training of networks of more than 1,000 layers with significant performance gains. This paper deciphers ResNet by analyzing the effect of skip connections, and puts forward new theoretical results on the advantages of identity skip connections in neural networks. We prove that the skip connections in the residual blocks facilitate preserving the norm of…
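The norm-preserving effect of an identity skip connection can be illustrated with a small numerical sketch. This is not the paper's construction — it assumes a linear residual branch with a small weight scale (the names `residual_block` and the factor 0.001 are illustrative) so that the block's Jacobian is close to the identity and backpropagated gradient norms are approximately preserved:

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    # Identity skip connection: y = x + F(x), with a linear residual branch F(x) = W x.
    return x + W @ x

d = 64
# Small residual branch, so the Jacobian I + W is close to the identity.
W = 0.001 * rng.standard_normal((d, d))
J = np.eye(d) + W

# A gradient backpropagated through the block keeps roughly the same norm.
g = rng.standard_normal(d)
ratio = np.linalg.norm(J.T @ g) / np.linalg.norm(g)
print(round(ratio, 3))  # close to 1.0
```

Without the identity term (Jacobian `W` alone), the same gradient would be scaled down by orders of magnitude, which is the vanishing-gradient behavior the skip connection avoids.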
    Citations of this paper.
    • Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention (27 citations)
    • GResNet: Graph Residual Network for Reviving Deep GNNs from Suspended Animation (12 citations)
    • Unsupervised Domain Adaptation: An Adaptive Feature Norm Approach (7 citations)
    • A Generic Improvement to Deep Residual Networks Based on Gradient Flow
    • Identity Connections in Residual Nets Improve Noise Stability
    • W-Cell-Net: Multi-frame Interpolation of Cellular Microscopy Videos


    Publications referenced by this paper.
    • Deep Residual Learning for Image Recognition (50,533 citations; highly influential)
    • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (18,924 citations)
    • Identity Mappings in Deep Residual Networks (3,892 citations; highly influential)
    • Understanding the difficulty of training deep feedforward neural networks (8,543 citations; highly influential)
    • Densely Connected Convolutional Networks (9,761 citations)
    • Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (8,034 citations; highly influential)
    • Identity Matters in Deep Learning (221 citations)
    • Residual Networks Behave Like Ensembles of Relatively Shallow Networks (441 citations)
    • The Shattered Gradients Problem: If resnets are the answer, then what is the question? (127 citations)
    • Learning Multiple Layers of Features from Tiny Images (9,107 citations)