Aggregated Residual Transformations for Deep Neural Networks

@inproceedings{Xie2017AggregatedRT,
  title={Aggregated Residual Transformations for Deep Neural Networks},
  author={Saining Xie and Ross B. Girshick and Piotr Doll{\'a}r and Zhuowen Tu and Kaiming He},
  booktitle={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017},
  pages={5987-5995}
}
We present a simple, highly modularized network architecture for image classification. Our network is constructed by repeating a building block that aggregates a set of transformations with the same topology. Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set. This strategy exposes a new dimension, which we call cardinality (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width…
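The aggregated-transformation block described above maps naturally onto a grouped convolution. Below is a minimal PyTorch sketch of such a bottleneck block, assuming illustrative channel sizes (256-d input, cardinality 32, bottleneck width 4); it is an illustration of the idea, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Bottleneck block that aggregates `cardinality` transformations of
    identical topology via a grouped 3x3 convolution (illustrative sketch)."""
    def __init__(self, in_channels=256, bottleneck_width=4, cardinality=32):
        super().__init__()
        mid = cardinality * bottleneck_width  # e.g. 32 * 4 = 128
        self.branch = nn.Sequential(
            nn.Conv2d(in_channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            # grouped conv == `cardinality` parallel paths with the same topology
            nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                      groups=cardinality, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, in_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(in_channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.branch(x))  # identity shortcut

x = torch.randn(1, 256, 56, 56)
print(ResNeXtBlock()(x).shape)  # torch.Size([1, 256, 56, 56])
```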
Learning Strict Identity Mappings in Deep Residual Networks
TLDR: This paper proposes an architecture that automatically discards redundant layers, namely those whose responses are smaller than a threshold ε, without any loss in performance, achieving a reduction of about 80% in the number of parameters.
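As a rough illustration of the idea (not the paper's actual method), a residual block can be collapsed to a strict identity mapping once its branch responses stay below a threshold ε; the sketch below assumes a simple max-magnitude check.

```python
import torch
import torch.nn as nn

class PrunableResidualBlock(nn.Module):
    """Residual block whose branch is dropped at inference time if its
    responses stay below a threshold eps (illustrative only)."""
    def __init__(self, channels, eps=1e-2):
        super().__init__()
        self.eps = eps
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.redundant = False  # set True once the branch is deemed negligible

    def mark_if_redundant(self, x):
        # If the branch output is uniformly small, the block is ~identity.
        with torch.no_grad():
            self.redundant = self.branch(x).abs().max().item() < self.eps

    def forward(self, x):
        if self.redundant:
            return x  # strict identity: the whole branch can be removed
        return torch.relu(x + self.branch(x))
```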
Sequentially Aggregated Convolutional Networks
TLDR: This work exploits the aggregation nature of shortcut connections at a finer architectural level, arriving at a sequentially aggregated convolutional layer that combines the benefits of both wide and deep representations by aggregating features of various depths in sequence.
Learning Transferable Architectures for Scalable Image Recognition
TLDR: This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset, and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.
Gated Convolutional Networks with Hybrid Connectivity for Image Classification
TLDR: Experimental results on the CIFAR and ImageNet datasets show that HCGNet is markedly more efficient than DenseNet and can also significantly outperform state-of-the-art networks with less complexity.
Data-Driven Sparse Structure Selection for Deep Neural Networks
TLDR: A simple and effective framework to learn and prune deep models in an end-to-end manner by adding sparsity regularization on scaling factors and solving the resulting optimization problem with a modified stochastic Accelerated Proximal Gradient (APG) method.
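A hedged sketch of the general idea: attach a learnable scaling factor to each structure (here, a residual branch) and penalize its magnitude so that redundant structures are driven toward zero. The paper optimizes this with a modified APG step; the sketch below simply adds an L1 penalty to the loss for illustration.

```python
import torch
import torch.nn as nn

class ScaledBranch(nn.Module):
    """Residual branch multiplied by a learnable scaling factor; an L1
    penalty on the factor drives redundant branches toward zero
    (illustrative; the paper uses a modified APG solver instead)."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return x + self.scale * self.body(x)

def sparsity_penalty(model, gamma=1e-4):
    # Sum of |scale| over all ScaledBranch modules, added to the training loss.
    return gamma * sum(m.scale.abs().sum()
                       for m in model.modules() if isinstance(m, ScaledBranch))
```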
Dual Path Networks
TLDR: This work reveals the equivalence of the state-of-the-art Residual Network (ResNet) and Densely Connected Convolutional Network (DenseNet) within the HORNN framework, and finds that ResNet enables feature re-use while DenseNet enables exploration of new features, both of which are important for learning good representations.
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation
TLDR: This paper proposes a "network decomposition" strategy named Group-Net, in which each full-precision group can be effectively reconstructed by aggregating a set of homogeneous binary branches, and shows strong generalization to other tasks.
Aggregated squeeze-and-excitation transformations for densely connected convolutional networks
TLDR: This paper proposes lightweight Densely Connected and Inter-Sparse Convolutional Networks with aggregated Squeeze-and-Excitation transformations (DenisNet-SE), which achieve better performance than state-of-the-art networks while requiring fewer parameters.
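For context, the squeeze-and-excitation transformation that DenisNet-SE aggregates can be sketched as a standard SE block (how the paper combines these blocks with dense and inter-sparse connectivity is not shown here).

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Standard squeeze-and-excitation block: global pooling ("squeeze")
    followed by a two-layer gating network ("excitation") that rescales
    each channel; shown only as the primitive the paper builds on."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.gate(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w
```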
Batch Normalization with Enhanced Linear Transformation
TLDR: This paper proposes to additionally consider each neuron's neighborhood when computing the output of the linear transformation module of batch normalization, and shows that the resulting BNET accelerates the convergence of network training and enhances spatial information by assigning larger weights to the important neurons.
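One plausible reading of the enhanced linear transformation is to replace BN's per-channel affine step with a depthwise k x k convolution, so each output also depends on its spatial neighborhood. The sketch below implements that reading and is an assumption, not the authors' code.

```python
import torch.nn as nn

class BNET2d(nn.Module):
    """Batch normalization whose per-channel affine transform is replaced
    by a depthwise k x k convolution over each neuron's neighborhood
    (a sketch of the stated idea, not the reference implementation)."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)
        self.linear = nn.Conv2d(channels, channels, kernel_size=k,
                                padding=k // 2, groups=channels, bias=True)

    def forward(self, x):
        return self.linear(self.norm(x))
```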
Rethinking Binary Neural Network for Accurate Image Classification and Semantic Segmentation
TLDR: This paper proposes to train a network with both binary weights and binary activations, designed specifically for mobile devices with limited computation capacity and power budgets, and argues that considering both value and structure approximation should be the future development direction of BNNs.

References

Showing 1-10 of 59 references.
Wide Residual Networks
TLDR: This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width increased; the resulting structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.
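A minimal sketch of a widened residual block, assuming a widening factor k that multiplies the channel count (projection shortcut and striding are omitted for brevity).

```python
import torch.nn as nn

def wide_basic_block(in_planes, planes, widen_factor=10, dropout=0.3):
    """WRN-style basic block: the channel count is multiplied by a widening
    factor k, trading extra width for reduced depth (illustrative sizes;
    the shortcut path that completes the residual unit is omitted)."""
    width = planes * widen_factor
    return nn.Sequential(
        nn.BatchNorm2d(in_planes),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_planes, width, 3, padding=1, bias=False),
        nn.Dropout(dropout),
        nn.BatchNorm2d(width),
        nn.ReLU(inplace=True),
        nn.Conv2d(width, width, 3, padding=1, bias=False),
    )
```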
Understanding Deep Architectures using a Recursive Convolutional Network
TLDR: This work empirically confirms the notion that, within the context of convolutional layers, adding layers alone increases computational power, and finds that the number of feature maps is ancillary, deriving most of its benefit from the introduction of more weights.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge…
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TLDR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit, and derives a robust initialization method that specifically accounts for the rectifier nonlinearities.
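Both contributions are easy to state in code: PReLU adds a learnable slope for negative inputs, f(x) = max(0, x) + a * min(0, x), and the accompanying initialization is what PyTorch exposes as Kaiming/He initialization. A short sketch:

```python
import torch.nn as nn

# PReLU generalizes ReLU with a learnable slope `a` on the negative side:
#   f(x) = max(0, x) + a * min(0, x)
prelu = nn.PReLU(num_parameters=1, init=0.25)

# The same paper's initialization for rectifier nonlinearities is available
# in PyTorch as kaiming_normal_:
conv = nn.Conv2d(64, 64, 3, padding=1)
nn.init.kaiming_normal_(conv.weight, nonlinearity='relu')
```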
Identity Mappings in Deep Residual Networks
TLDR: The propagation formulations behind the residual building blocks suggest that forward and backward signals can be directly propagated from one block to any other block when identity mappings are used as the skip connections and as the after-addition activation.
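The pre-activation ordering this paper arrives at (BN and ReLU before each convolution, identity shortcut, no activation after the addition) can be sketched as:

```python
import torch
import torch.nn as nn

class PreActBlock(nn.Module):
    """Pre-activation residual unit: BN and ReLU come before each
    convolution and the shortcut is a pure identity, so signals propagate
    directly between any two blocks in both directions."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(torch.relu(self.bn1(x)))
        out = self.conv2(torch.relu(self.bn2(out)))
        return x + out  # no activation after the addition
```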
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement over the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Deep Residual Learning for Image Recognition
TLDR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize and can gain accuracy from considerably increased depth.
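The basic residual unit, y = F(x, {W_i}) + x with an identity shortcut, can be sketched as follows (illustrative 3x3 convolutions, same input and output width):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Original residual unit: the stacked layers learn a residual
    function F(x) and the block outputs relu(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # identity shortcut
```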
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
TLDR: DeCAF, an open-source implementation of deep convolutional activation features, is released along with all associated network parameters, enabling vision researchers to conduct experiments with deep representations across a range of visual concept learning paradigms.
Deep Roots: Improving CNN Efficiency with Hierarchical Filter Groups
We propose a new method for creating computationally efficient and compact convolutional neural networks (CNNs) using a novel sparse connection structure that resembles a tree root. This allows a…
Rethinking the Inception Architecture for Computer Vision
TLDR: This work explores ways to scale up networks that utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
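One of the factorization tricks the paper describes, replacing an n x n convolution with a 1 x n followed by an n x 1 convolution, can be sketched in a channel-preserving form:

```python
import torch.nn as nn

def factorized_conv(channels, n=7):
    """Factorize an n x n convolution into a 1 x n followed by an n x 1
    convolution: the same receptive field at roughly n/2 times lower cost
    when input and output channel counts match (illustrative sketch)."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=(1, n), padding=(0, n // 2)),
        nn.Conv2d(channels, channels, kernel_size=(n, 1), padding=(n // 2, 0)),
    )
```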