Corpus ID: 237108365

CrossNorm and SelfNorm for Generalization under Distribution Shifts

@inproceedings{Tang2021CrossNormAS,
  title={CrossNorm and SelfNorm for Generalization under Distribution Shifts},
  author={Zhiqiang Tang and Yunhe Gao and Yi Zhu and Zhi Zhang and Mu Li and Dimitris N. Metaxas},
  year={2021}
}
  • Published 4 February 2021
  • Computer Science
Traditional normalization techniques (e.g., Batch Normalization and Instance Normalization) generally and simplistically assume that training and test data follow the same distribution. As distribution shifts are inevitable in real-world applications, well-trained models using previous normalization methods can perform poorly in new environments. Can we develop new normalization methods to improve generalization robustness under distribution shifts? In this paper, we answer the question by…
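The truncated abstract does not spell out how CrossNorm and SelfNorm work, so the sketch below only illustrates the per-channel statistics that Batch/Instance Normalization rely on, plus an assumed CrossNorm-style exchange of mean/std between two feature maps inferred from the title. This is not the authors' code; the function names are illustrative.

```python
import torch

def channel_stats(x, eps=1e-5):
    """Per-sample, per-channel mean and std of an NCHW feature map
    (the statistics used by Instance Normalization)."""
    mean = x.mean(dim=(2, 3), keepdim=True)
    std = (x.var(dim=(2, 3), keepdim=True) + eps).sqrt()
    return mean, std

def crossnorm(x_a, x_b):
    """Illustrative CrossNorm-style exchange (an assumption based on the title,
    not the official implementation): standardize x_a with its own statistics,
    then re-scale/shift it with the statistics of x_b."""
    mean_a, std_a = channel_stats(x_a)
    mean_b, std_b = channel_stats(x_b)
    return (x_a - mean_a) / std_a * std_b + mean_b

# Toy usage: two feature maps with different channel statistics.
a = torch.randn(4, 8, 16, 16)
b = 2.0 * torch.randn(4, 8, 16, 16) + 1.0
print(crossnorm(a, b).shape)  # torch.Size([4, 8, 16, 16])
```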

References

Showing 1-10 of 52 references
Instance Enhancement Batch Normalization: an Adaptive Regulator of Batch Noise
TLDR: Proposes an attention-based BN, Instance Enhancement Batch Normalization (IEBN), which recalibrates each channel with a simple linear transformation; IEBN regulates batch noise and stabilizes network training, improving generalization even in the presence of two kinds of noise attacks during training.
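A minimal sketch of the channel recalibration described above, assuming an SE-style design: standard Batch Normalization followed by a per-channel linear transform of an instance-level descriptor that gates each channel. Module and parameter names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ChannelRecalibratedBN(nn.Module):
    """Hedged IEBN-style sketch: BN, then a sigmoid gate produced by a simple
    per-channel linear transform of the instance's pooled response."""
    def __init__(self, num_channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_channels)
        self.a = nn.Parameter(torch.zeros(1, num_channels, 1, 1))  # per-channel scale
        self.b = nn.Parameter(torch.ones(1, num_channels, 1, 1))   # per-channel bias

    def forward(self, x):
        y = self.bn(x)
        s = y.mean(dim=(2, 3), keepdim=True)       # instance-level channel descriptor
        gate = torch.sigmoid(self.a * s + self.b)  # simple linear transform + sigmoid
        return y * gate

x = torch.randn(2, 16, 8, 8)
print(ChannelRecalibratedBN(16)(x).shape)  # torch.Size([2, 16, 8, 8])
```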
Exemplar Normalization for Learning Deep Representation
TLDR: This work investigates a novel dynamic learning-to-normalize (L2N) problem by proposing Exemplar Normalization (EN), which is able to learn different normalization methods for different convolutional layers and image samples of a deep network.
Revisiting Batch Normalization For Practical Domain Adaptation
TLDR: This paper proposes a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN, and demonstrates that the method is complementary with other existing methods and may further improve model performance.
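A hedged sketch of AdaBN-style adaptation, assuming the remedy amounts to re-estimating BatchNorm running statistics on unlabeled target-domain data while keeping all learned weights fixed; the helper name and loop are illustrative, not the paper's released code.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_stats(model, target_loader, device="cpu"):
    """Re-estimate every BatchNorm layer's running mean/var on target-domain
    batches; no weights are updated (hedged AdaBN-style sketch)."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()  # forget source-domain statistics
            m.momentum = None        # use a cumulative moving average instead
    model.train()                    # BN updates running stats only in train mode
    for x, *_ in target_loader:
        model(x.to(device))
    model.eval()
    return model

# Toy usage with a small model and random "target-domain" batches.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
loader = [(torch.randn(16, 3, 32, 32),) for _ in range(4)]
adapt_bn_stats(model, loader)
```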
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
TLDR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps and beats the original model by a significant margin.
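For reference, a minimal sketch of what a BatchNorm layer computes at training time: normalize each channel over the batch and spatial dimensions, then apply a learnable affine transform. This follows the standard formulation rather than any particular implementation.

```python
import torch

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-time Batch Normalization on an NCHW batch: statistics are
    pooled over (N, H, W) for each channel."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta

x = torch.randn(8, 4, 16, 16)
gamma, beta = torch.ones(1, 4, 1, 1), torch.zeros(1, 4, 1, 1)
y = batch_norm_train(x, gamma, beta)
print(y.mean().item(), y.std().item())  # ~0 and ~1 with the identity affine above
```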
Group Normalization
TLDR: Group Normalization can outperform its BN-based counterparts for object detection and segmentation on COCO and for video classification on Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.
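A minimal sketch of Group Normalization: channels are split into groups and each group is normalized per sample, so the statistics do not depend on batch size. The comparison against torch.nn.functional.group_norm is only a sanity check.

```python
import torch
import torch.nn.functional as F

def group_norm(x, num_groups, gamma, beta, eps=1e-5):
    """Per-sample normalization over groups of channels (batch-size independent)."""
    n, c, h, w = x.shape
    g = x.view(n, num_groups, c // num_groups, h, w)
    mean = g.mean(dim=(2, 3, 4), keepdim=True)
    var = g.var(dim=(2, 3, 4), unbiased=False, keepdim=True)
    g_hat = (g - mean) / torch.sqrt(var + eps)
    return gamma * g_hat.view(n, c, h, w) + beta

x = torch.randn(2, 8, 4, 4)
y = group_norm(x, 4, torch.ones(1, 8, 1, 1), torch.zeros(1, 8, 1, 1))
print(torch.allclose(y, F.group_norm(x, 4), atol=1e-5))  # True
```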
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data
TLDR: Proposes a new approach of domain randomization and pyramid consistency to learn a model with high generalizability for semantic segmentation of real-world self-driving scenes, in a domain generalization fashion without accessing target-domain data.
Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net
TLDR: Presents IBN-Net, a novel convolutional architecture that remarkably enhances a CNN's modeling ability on one domain as well as its generalization capacity on another domain without finetuning.
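A hedged sketch of an IBN-style layer, assuming it mixes Instance Normalization (for appearance invariance) and Batch Normalization (for discriminative content) over disjoint channel splits; the 50/50 split ratio and names are illustrative rather than the official IBN-Net code.

```python
import torch
import torch.nn as nn

class IBN(nn.Module):
    """Apply Instance Norm to one half of the channels and Batch Norm to the
    other half (hedged sketch of the IBN idea)."""
    def __init__(self, num_channels, ratio=0.5):
        super().__init__()
        self.half = int(num_channels * ratio)
        self.inorm = nn.InstanceNorm2d(self.half, affine=True)
        self.bnorm = nn.BatchNorm2d(num_channels - self.half)

    def forward(self, x):
        a, b = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.inorm(a), self.bnorm(b)], dim=1)

print(IBN(64)(torch.randn(2, 64, 8, 8)).shape)  # torch.Size([2, 64, 8, 8])
```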
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features
TLDR: Patches are cut and pasted among training images, with the ground-truth labels mixed proportionally to the area of the patches; CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification as well as on the ImageNet weakly-supervised localization task.
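A minimal sketch of the CutMix operation described above: paste a random rectangle from a shuffled copy of the batch into each image and mix the labels in proportion to the pasted area. The hyper-parameters and helper name are illustrative.

```python
import torch

def cutmix(images, labels, alpha=1.0):
    """Cut a random box from a permuted copy of the batch, paste it into each
    image, and return both label sets plus the area-based mixing weight."""
    n, _, h, w = images.shape
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(n)

    # Sample a box whose area is roughly (1 - lam) of the image.
    rh, rw = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - rh // 2, 0), min(cy + rh // 2, h)
    x1, x2 = max(cx - rw // 2, 0), min(cx + rw // 2, w)

    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (h * w)  # correct lambda for the clipped box
    return mixed, labels, labels[perm], lam

imgs, labs = torch.randn(8, 3, 32, 32), torch.randint(10, (8,))
mixed, y_a, y_b, lam = cutmix(imgs, labs)
# Training loss would be: lam * CE(logits, y_a) + (1 - lam) * CE(logits, y_b)
```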
Pretrained Transformers Improve Out-of-Distribution Robustness
TLDR: This work constructs a new robustness benchmark with realistic distribution shifts to systematically measure out-of-distribution (OOD) generalization across seven NLP datasets, finding that larger models are not necessarily more robust, distillation can be harmful, and more diverse pretraining data can enhance robustness.
Aggregated Residual Transformations for Deep Neural Networks
TLDR: On the ImageNet-1K dataset, it is empirically shown that, even under the restricted condition of maintaining complexity, increasing cardinality improves classification accuracy and is more effective than going deeper or wider when increasing capacity.
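A short sketch of how the aggregated transformations (cardinality) can be written as a grouped convolution; the channel widths below are illustrative and the residual skip connection is omitted.

```python
import torch
import torch.nn as nn

# Transformation branch of a ResNeXt-style bottleneck: the 32 parallel paths
# are expressed as a single 3x3 convolution with groups=32 (cardinality).
branch = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=1, bias=False),
    nn.BatchNorm2d(128), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=32, bias=False),
    nn.BatchNorm2d(128), nn.ReLU(inplace=True),
    nn.Conv2d(128, 256, kernel_size=1, bias=False),
    nn.BatchNorm2d(256),
)
print(branch(torch.randn(1, 256, 14, 14)).shape)  # torch.Size([1, 256, 14, 14])
```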