Local Context Normalization: Revisiting Local Normalization

Anthony Ortiz, Caleb Robinson, Dan Morris, Olac Fuentes, Christopher Kiekintveld, Mahmudulla Hassan, Nebojsa Jojic
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Normalization layers have been shown to improve convergence in deep neural networks, and even add useful inductive biases. In many vision applications the local spatial context of the features is important, but most common normalization schemes including Group Normalization (GN), Instance Normalization (IN), and Layer Normalization (LN) normalize over the entire spatial dimension of a feature. This can wash out important signals and degrade performance. For example, in applications that use… 
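The contrast drawn in the abstract, between statistics taken over the entire spatial extent and statistics taken over a local neighborhood, can be illustrated with a toy example. The following is a minimal sketch and not the paper's method, since the abstract is truncated before the method is described. The single-channel feature map, the function names `global_normalize` and `local_normalize`, the 3x3 window, and the border clipping are all assumptions made for illustration.

```python
import math


def global_normalize(fmap, eps=1e-5):
    """Normalize a 2D map with one mean/std computed over all spatial positions."""
    values = [v for row in fmap for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [[(v - mean) / std for v in row] for row in fmap]


def local_normalize(fmap, window=3, eps=1e-5):
    """Normalize each position with the mean/std of its window x window neighborhood."""
    h, w = len(fmap), len(fmap[0])
    r = window // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Neighborhood clipped at the borders (an illustrative choice).
            patch = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            mean = sum(patch) / len(patch)
            var = sum((v - mean) ** 2 for v in patch) / len(patch)
            out[i][j] = (fmap[i][j] - mean) / math.sqrt(var + eps)
    return out


# Toy map: a faint bump at (1, 1) on the left, a strong region on the right.
fmap = [[0.0] * 8 for _ in range(4)]
fmap[1][1] = 0.1
for i in range(4):
    for j in range(5, 8):
        fmap[i][j] = 10.0

glob = global_normalize(fmap)
loc = local_normalize(fmap, window=3)

# Contrast between the faint bump and its left neighbor under each scheme.
print("global contrast:", glob[1][1] - glob[1][0])
print("local contrast: ", loc[1][1] - loc[1][0])
```

In this toy case the strong region on the right dominates the global statistics, so the faint bump is barely distinguishable from its neighbor after global normalization, while the window-based statistics keep it clearly visible.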
WeightAlign: Normalizing Activations by Weight Alignment
WeightAlign is presented: a method that normalizes the weights by the mean and scaled standard deviation computed within a filter, which normalizes activations without computing any sample statistics.
Representative Batch Normalization with Feature Calibration
This work proposes adding a simple yet effective feature calibration scheme to the centering and scaling operations of BatchNorm, enhancing instance-specific representations at negligible computational cost.
Normalization Techniques in Training DNNs: Methodology, Analysis and Application
A unified picture of the main motivation behind different approaches from the perspective of optimization is provided, and a taxonomy for understanding the similarities and differences between them is presented.
Neighborhood Normalization for Robust Geometric Feature Learning
This work introduces a new normalization technique, Batch-Neighborhood Normalization, aiming to improve robustness to mean-std variation of local feature distributions that presumably can happen in samples with varying point density.
Brain-inspired Weighted Normalization for CNN Image Classification
Weighted normalization outperformed other normalizations in image classification tasks on CIFAR-10, ImageNet, and a customized textured MNIST dataset, and the advantage is more prominent when the CNN is shallow.
Separable Batch Normalization for Robust Facial Landmark Localization with Cross-protocol Network Training
A novel Separable Batch Normalization module with a Cross-protocol Network Training (CNT) strategy for robust facial landmark localization and a novel attention mechanism that assigns different weights to each branch for automatic selection in an effective style are presented.
A Stricter Constraint Produces Outstanding Matching: Learning More Reliable Image Matching Using a Quadratic Hinge Triplet Loss Network
Image matching is widely used in many applications, such as visual-based localization and 3D reconstruction. Compared with traditional local features (e.g., SIFT) and outlier elimination methods
Detecting Cattle and Elk in the Wild from Space
This work proposes a baseline method, CowNet, that simultaneously estimates the number of animals in an image (counts) and predicts their location at the pixel level (localizes). It also proposes a methodology for evaluating such models on counting and localization tasks across large scenes, one that takes into account the uncertainty of noisy labels and the information needed by stakeholders in ecological monitoring tasks.
Revisiting Global Statistics Aggregation for Improving Image Restoration
This paper shows that statistics aggregated over patch-based features during training and over entire-image features during testing may be distributed very differently, which degrades the performance of image restorers. It proposes a simple approach, Test-time Local Statistics Converter (TLSC), that switches the statistics aggregation region from global to local at test time only.


Group Normalization
Group Normalization can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Multi-Scale Context Aggregation by Dilated Convolutions
This work develops a new convolutional network module that is specifically designed for dense prediction, and shows that the presented context module increases the accuracy of state-of-the-art semantic segmentation systems.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition
High-Resolution Representations for Labeling Pixels and Regions
A simple modification is introduced to augment the high-resolution representation by aggregating the (upsampled) representations from all the parallel convolutions rather than only the representation from the high-resolution convolution, which leads to stronger representations, evidenced by superior results.
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
A reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction is presented, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.
Learning multiple visual domains with residual adapters
This paper develops a tunable deep network architecture that, by means of adapter residual modules, can be steered on the fly to diverse visual domains. It also introduces the Visual Decathlon Challenge, a benchmark that evaluates the ability of representations to capture ten very different visual domains simultaneously and measures their ability to recognize well uniformly.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).