Training Faster by Separating Modes of Variation in Batch-Normalized Models

@article{Kalayeh2020TrainingFB,
  title={Training Faster by Separating Modes of Variation in Batch-Normalized Models},
  author={M. Kalayeh and M. Shah},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2020},
  volume={42},
  pages={1483-1500}
}
  • M. Kalayeh, M. Shah
  • Published 2020
  • Computer Science, Mathematics, Medicine
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNN). It normalizes the layer outputs during training using the statistics of each mini-batch. BN accelerates the training procedure by allowing large learning rates to be used safely and alleviates the need for careful initialization of the parameters. In this work, we study BN from the viewpoint of Fisher kernels that arise from generative probability models. We show that assuming…
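
To make the normalization step described in the abstract concrete, below is a minimal NumPy sketch of batch normalization at training time for a fully connected activation of shape (N, D). It is an illustration only: the function name batch_norm_train, the toy batch, and the epsilon value are assumptions made for this sketch, not code from the paper.

import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    # Per-feature statistics computed over the mini-batch dimension,
    # i.e. the layer output is normalized "using the statistics of
    # each mini-batch" as the abstract describes.
    mu = x.mean(axis=0)    # shape (D,)
    var = x.var(axis=0)    # shape (D,)

    # Standardize, then restore representational capacity with the
    # learnable per-feature scale (gamma) and shift (beta).
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Toy usage: a mini-batch of 8 examples with 4 features each.
rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(8, 4))
out = batch_norm_train(batch, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # approximately 0 per feature
print(out.var(axis=0))   # approximately 1 per feature

At inference time, BN replaces the per-batch mean and variance with running averages accumulated during training, so predictions do not depend on the composition of a mini-batch.
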
    9 Citations
    • Regularizing Activations in Neural Networks via Distribution Matching with the Wasserstein Metric (2019, 3 citations)
    • Attentive Normalization (3 citations)
    • Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift (4 citations)
    • Mode Normalization (11 citations)
    • Normalization Techniques in Training DNNs: Methodology, Analysis and Application
    • Beyond Grids: Learning Graph Representations for Visual Recognition (46 citations)
    • Describing Images by Semantic Modeling using Attributes and Tags
    • Analysis of open data of a social network in order to identify deviant communities, Rostislav Mikherskii (2020)

    References

    Showing 1-10 of 57 references
    • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (20,767 citations, highly influential)
    • Group Normalization (808 citations, highly influential)
    • Layer Normalization (1,793 citations, highly influential)
    • Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes (52 citations, highly influential)
    • Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models, S. Ioffe, NIPS 2017 (253 citations, highly influential)
    • Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks (935 citations)
    • Network In Network (3,222 citations)
    • Deep Networks with Stochastic Depth (910 citations)
    • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (1,110 citations)
    • Learning Multiple Layers of Features from Tiny Images (10,001 citations, highly influential)