Online Convolutional Re-parameterization

Authors: Mu Hu, Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xiaojin Gong, Xiansheng Hua
Structural re-parameterization has drawn increasing attention in various computer vision tasks. It aims at improving the performance of deep models without introducing any inference-time cost. Though efficient during inference, such models rely heavily on complicated training-time blocks to achieve high accuracy, leading to a large extra training cost. In this paper, we present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead…
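
The general idea behind structural re-parameterization can be illustrated with a small sketch. It is not OREPA itself, whose two stages are not described in the truncated abstract above. It only shows, under simplifying assumptions (1D signals, zero padding, no normalization layers), why a multi-branch training-time block made of linear operations can be replaced by a single convolution at inference. The function names `conv1d_same` and `merge_parallel_kernels` are hypothetical and chosen for illustration.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1D cross-correlation with an odd-length kernel.

    The output has the same length as the input.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def merge_parallel_kernels(k3, k1):
    """Fold a size-1 branch into a size-3 branch.

    The size-1 kernel is treated as a size-3 kernel whose only
    non-zero tap is the centre one, then the two kernels are added.
    """
    return [k3[0], k3[1] + k1[0], k3[2]]


x = [1.0, 2.0, -1.0, 0.5]
k3 = [0.5, -1.0, 0.25]  # training-time branch A (kernel size 3)
k1 = [2.0]              # training-time branch B (kernel size 1)

# Training-time block: two branches evaluated separately, then summed.
train_out = [a + b for a, b in zip(conv1d_same(x, k3), conv1d_same(x, k1))]

# Inference-time block: one convolution with the merged kernel.
merged = merge_parallel_kernels(k3, k1)
infer_out = conv1d_same(x, merged)
```

Because convolution is linear in its kernel, `train_out` and `infer_out` agree up to floating-point error, so the extra branch costs nothing at inference. The training-time cost of evaluating every branch is the overhead the abstract refers to.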


YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS, and has the highest accuracy, 56.8% AP, among all known real-time object detectors with 30 FPS

Make RepVGG Greater Again: A Quantization-aware Approach

This paper proposes a simple, robust, and effective remedy to have a quantization-friendly structure that also enjoys reparameterization benefits, and greatly bridges the gap between INT8 and FP32 accuracy for RepVGG.



RepVGG: Making VGG-style ConvNets Great Again

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3 × 3 convolution and ReLU, while the

ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks

This paper proposes to expand each linear layer of the compact network into multiple linear layers, without adding any nonlinearity, so that the resulting expanded network can benefit from over-parameterization during training but can be compressed back to the compact one algebraically at inference.
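
The algebraic compression mentioned in this summary can be sketched for the simplest case of two stacked fully connected layers with no nonlinearity in between. This is a minimal illustration of the underlying identity, y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2), not the paper's own expansion scheme. The helper names and the example weights are assumptions made for the demonstration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def collapse(w1, b1, w2, b2):
    """Merge two consecutive linear layers into a single one."""
    w = matmul(w2, w1)
    b = [p + q for p, q in zip(matvec(w2, b1), b2)]
    return w, b


# Expanded (training-time) form: 2 -> 3 -> 2 with no activation in between.
w1 = [[1.0, 0.5], [-1.0, 2.0], [0.0, 1.0]]
b1 = [0.5, 0.0, -1.0]
w2 = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0]]
b2 = [0.0, 1.0]

x = [1.0, -2.0]

# Output of the expanded network.
hidden = [h + c for h, c in zip(matvec(w1, x), b1)]
expanded_out = [o + c for o, c in zip(matvec(w2, hidden), b2)]

# Output of the compact network obtained by collapsing the two layers.
w, b = collapse(w1, b1, w2, b2)
compact_out = [o + c for o, c in zip(matvec(w, x), b)]
```

The compact layer has a 2 × 2 weight matrix regardless of how wide the intermediate layer was, which is why the over-parameterization is free at inference.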

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

A new scaling method is proposed that uniformly scales all dimensions of depth, width, and resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated by scaling up MobileNets and ResNet.
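
A minimal sketch of compound scaling follows. The summary above does not give the formula; the form used here (depth, width, and resolution multipliers alpha^phi, beta^phi, gamma^phi) and the default constants 1.2, 1.1, and 1.15 are taken from the EfficientNet paper rather than from this page, and the function names are hypothetical.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return depth, width, and resolution multipliers for coefficient phi.

    alpha, beta, gamma are per-dimension base factors; the defaults are
    the values reported in the EfficientNet paper (an assumption here).
    """
    return {
        "depth": alpha ** phi,
        "width": beta ** phi,
        "resolution": gamma ** phi,
    }


def flops_multiplier(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Approximate growth in FLOPs for coefficient phi.

    FLOPs grow roughly linearly with depth and quadratically with
    width and resolution, hence alpha * beta^2 * gamma^2 per unit of phi.
    """
    return (alpha * beta ** 2 * gamma ** 2) ** phi


baseline = compound_scale(0)   # all multipliers equal 1.0
scaled = compound_scale(3)     # a larger model from one scalar knob
cost = flops_multiplier(1)     # close to 2 by construction
```

A single scalar `phi` therefore controls all three dimensions at once, with the base factors chosen so that each unit increase in `phi` roughly doubles the compute.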

Characterizing signal propagation to close the performance gap in unnormalized ResNets

A simple set of analysis tools to characterize signal propagation on the forward pass is proposed, and this technique preserves the signal in networks with ReLU or Swish activation functions by ensuring that the per-channel activation means do not grow with depth.

DO-Conv: Depthwise Over-Parameterized Convolutional Layer

This paper shows with extensive experiments that the mere replacement of conventional convolutional layers with DO-Conv layers boosts the performance of CNNs on many classical vision tasks, such as image classification, detection, and segmentation.

Squeeze-and-Excitation Networks

This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
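
The channel recalibration described here can be sketched in a few lines. The structure below (global average pooling, a reducing linear layer with ReLU, an expanding linear layer with a sigmoid, then per-channel scaling) follows the SE paper rather than anything stated on this page. Biases are omitted for brevity, and the function name `se_block` and its argument layout are assumptions.

```python
import math


def se_block(features, w_reduce, w_expand):
    """Apply a bias-free Squeeze-and-Excitation step.

    features: list of C channels, each a flat list of spatial values.
    w_reduce: (C/r) x C weight matrix as a list of rows.
    w_expand: C x (C/r) weight matrix as a list of rows.
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: one descriptor per channel via global average pooling.
    squeezed = [sum(ch) / len(ch) for ch in features]

    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1).
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]

    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return scaled, gates


features = [[2.0, 4.0], [1.0, 3.0]]   # 2 channels, 2 spatial positions
w_reduce = [[0.5, -0.25]]             # reduce 2 channels to 1
w_expand = [[1.0], [-1.0]]            # expand back to 2 channels
scaled, gates = se_block(features, w_reduce, w_expand)
```

Because each gate depends on the pooled statistics of all channels, the rescaling of one channel is conditioned on the others, which is the interdependency modelling the summary refers to.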

Diverse Branch Block: Building a Convolution as an Inception-like Unit

This paper proposes a universal building block for Convolutional Neural Networks (ConvNets), named Diverse Branch Block (DBB), which enhances the representational capacity of a single convolution by combining diverse branches of different scales and complexities to enrich the feature space, including sequences of convolutions, multiscale convolution, and average pooling.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Factorized Convolutional Neural Networks

The proposed convolutional layer is composed of a low-cost single intra-channel convolution and a linear channel projection that can effectively preserve the spatial information and maintain the accuracy with significantly less computation.

Fixup Initialization: Residual Learning Without Normalization

This work proposes fixed-update initialization (Fixup), an initialization motivated by solving the exploding and vanishing gradient problem at the beginning of training via properly rescaling a standard initialization that enables residual networks without normalization to achieve state-of-the-art performance in image classification and machine translation.