Wide Residual Networks

@article{Zagoruyko2016WideRN,
  title={Wide Residual Networks},
  author={Sergey Zagoruyko and N. Komodakis},
  journal={ArXiv},
  year={2016},
  volume={abs/1605.07146}
}
Deep residual networks were shown to be able to scale up to thousands of layers and still have improving performance. [...]
Key Method
We call the resulting network structures wide residual networks (WRNs) and show that these are far superior over their commonly used thin and very deep counterparts. For example, we demonstrate that even a simple 16-layer-deep wide residual network outperforms in accuracy and efficiency all previous deep residual networks, including thousand-layer-deep networks, achieving new…
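To make the key idea concrete, here is a minimal sketch of a residual block whose channel count is scaled by a widening factor k. It is written in PyTorch purely for illustration; the class name `WideBasicBlock` and the exact layer choices are assumptions, not the authors' released code.

```python
# Illustrative sketch only (assumed PyTorch); not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WideBasicBlock(nn.Module):
    """A BN-ReLU-conv residual block whose width is scaled by a widening factor k."""
    def __init__(self, in_planes, planes, k=8, stride=1, dropout=0.0):
        super().__init__()
        width = planes * k                      # the widening factor multiplies the channel count
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, width, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(width)
        self.conv2 = nn.Conv2d(width, width, 3, stride=1, padding=1, bias=False)
        self.dropout = nn.Dropout(dropout)      # optional dropout between the two convolutions
        # Projection shortcut when the shape changes, identity otherwise.
        self.shortcut = (nn.Conv2d(in_planes, width, 1, stride=stride, bias=False)
                         if stride != 1 or in_planes != width else nn.Identity())

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.dropout(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + self.shortcut(x)

# A shallow (e.g. 16-layer) but wide stack of such blocks is the configuration the paper advocates.
block = WideBasicBlock(in_planes=16, planes=16, k=8)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 128, 32, 32])
```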
Multi-Residual Networks: Improving the Speed and Accuracy of Residual Networks
TLDR
A new convolutional neural network architecture is proposed which builds upon the success of residual networks by explicitly exploiting the interpretation of very deep networks as an ensemble, and generates models that are wider, rather than deeper, which significantly improves accuracy.
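A rough sketch of the multi-residual idea, assuming PyTorch and hypothetical names: a block adds several parallel residual functions to the identity path instead of just one, widening the model rather than deepening it.

```python
# Hedged sketch of a multi-residual block (assumed PyTorch; names are made up for illustration).
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_branch(channels):
    """One residual function F_i: two 3x3 convolutions with batch normalization."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, 3, padding=1, bias=False), nn.BatchNorm2d(channels), nn.ReLU(),
        nn.Conv2d(channels, channels, 3, padding=1, bias=False), nn.BatchNorm2d(channels),
    )

class MultiResidualBlock(nn.Module):
    def __init__(self, channels, num_functions=2):
        super().__init__()
        self.branches = nn.ModuleList(conv_branch(channels) for _ in range(num_functions))

    def forward(self, x):
        # y = x + F_1(x) + F_2(x) + ... : several residual functions share one identity path.
        return F.relu(x + sum(branch(x) for branch in self.branches))

print(MultiResidualBlock(16, num_functions=2)(torch.randn(1, 16, 32, 32)).shape)
```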
Weighted residuals for very deep networks
  • Falong Shen, Gang Zeng
  • Computer Science
  • 2016 3rd International Conference on Systems and Informatics (ICSAI)
  • 2016
TLDR
A weighted residual network is introduced to address the incompatibility between ReLU and element-wise addition and the deep network initialization problem and is able to learn to combine residuals from different layers effectively and efficiently.
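As a hedged illustration of the weighted-residual idea (assumed PyTorch; not the paper's code): a learnable scalar scales the residual branch before the element-wise addition, so a very deep stack can start close to a chain of identity mappings and learn how much each layer contributes.

```python
# Illustrative sketch only (assumed PyTorch); the wrapper name is hypothetical.
import torch
import torch.nn as nn

class WeightedResidual(nn.Module):
    def __init__(self, branch):
        super().__init__()
        self.branch = branch
        # Scalar weight on the residual; starting at zero keeps very deep networks easy to initialize.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return x + self.alpha * self.branch(x)

branch = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 16, 3, padding=1))
print(WeightedResidual(branch)(torch.randn(1, 16, 32, 32)).shape)
```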
Reducing parameter number in residual networks by sharing weights
TLDR
A way to reduce the redundant information of deep residual networks by sharing the weights of convolutional layers between residual blocks operating at the same spatial scale, which shows that they are almost as efficient as their sequential counterparts while involving less parameters.
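The sketch below illustrates, under assumed PyTorch conventions, how convolution weights might be shared across residual blocks at one spatial scale: the same two convolutions are applied repeatedly, so the unrolled network is deep but stores only one set of filters. The class name `SharedResidualStage` is made up for illustration.

```python
# Hedged sketch of weight sharing between residual blocks at the same spatial scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedResidualStage(nn.Module):
    def __init__(self, channels, num_iterations=4):
        super().__init__()
        self.num_iterations = num_iterations
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        # Batch-norm statistics are kept per iteration here even though the convolutions are shared.
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(2 * num_iterations))

    def forward(self, x):
        for i in range(self.num_iterations):
            out = self.conv1(F.relu(self.bns[2 * i](x)))
            out = self.conv2(F.relu(self.bns[2 * i + 1](out)))
            x = x + out
        return x

print(SharedResidualStage(16)(torch.randn(1, 16, 32, 32)).shape)
```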
Deep Pyramidal Residual Networks
TLDR
This research gradually increases the feature map dimension at all units to involve as many locations as possible in the network architecture and proposes a novel residual unit capable of further improving the classification accuracy with the new network architecture.
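A small illustrative snippet of a pyramidal widening schedule (the parameter names are assumptions): rather than doubling the channel count only at downsampling stages, every residual unit increases it slightly, so the width grows roughly linearly with depth.

```python
# Sketch of a linearly increasing channel schedule; values and names are illustrative assumptions.
def pyramidal_widths(base_channels=16, alpha=48, num_units=18):
    """Channel count of unit k: base + alpha * k / num_units, rounded to an integer."""
    return [round(base_channels + alpha * k / num_units) for k in range(1, num_units + 1)]

print(pyramidal_widths())  # gradually rising widths, e.g. 19, 21, 24, ..., 64
```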
Efficient Recurrent Residual Networks Improved by Feature Transfer (MSc Thesis)
Over the past several years, deep and wide neural networks have achieved great success in many tasks. However, in real life applications, because the gains usually come at a cost in terms of the…
Residual Networks of Residual Networks: Multilevel Residual Networks
TLDR
A novel residual network architecture, residual networks of residual networks (RoR), is proposed to better exploit the optimization ability of residual networks: RoR optimizes residual mappings of residual mappings in place of the original residual mappings.
Improved Highway Network Block for Training Very Deep Neural Networks
TLDR
The proposed highway networks, besides being more computationally efficient, are shown to have more interesting learning characteristics such as natural learning of hierarchical and robust representations due to a more effective usage of model depth, fewer gates for successful learning, better generalization capacity and faster convergence than the original highway network.
Sequentially Aggregated Convolutional Networks
TLDR
This work exploits the aggregation nature of shortcut connections at a finer architectural level and ends up with a sequentially aggregated convolutional layer that combines the benefits of both wide and deep representations by aggregating features of various depths in sequence.
PolyNet: A Pursuit of Structural Diversity in Very Deep Networks
TLDR
This work presents a new family of modules, namely the PolyInception, which can be flexibly inserted in isolation or in a composition as replacements of different parts of a network, and demonstrates substantial improvements over the state-of-the-art on the ILSVRC 2012 benchmark.
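One of the simpler compositions along these lines can be sketched as applying a module to its own output and summing the results with the identity path, i.e. y = x + F(x) + F(F(x)). The snippet below is an illustrative PyTorch rendering under that assumption, not the paper's implementation.

```python
# Illustrative sketch only (assumed PyTorch): a second-order "poly" composition of one module F.
import torch
import torch.nn as nn

class Poly2(nn.Module):
    def __init__(self, module):
        super().__init__()
        self.f = module

    def forward(self, x):
        fx = self.f(x)
        return x + fx + self.f(fx)   # identity + first-order + second-order path

f = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
print(Poly2(f)(torch.randn(1, 16, 32, 32)).shape)
```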
Multi-Residual Networks
TLDR
The effective range of ensembles is examined by introducing multi-residual networks that significantly improve classification accuracy of residual networks and obtain a test error rate of 3.92% on CIFAR-10 that outperforms all existing models.

References

Showing 1-10 of 40 references
Deep Networks with Stochastic Depth
TLDR
Stochastic depth is proposed, a training procedure that enables the seemingly contradictory setup to train short networks and use deep networks at test time and reduces training time substantially and improves the test error significantly on almost all data sets that were used for evaluation.
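A hedged sketch of the stochastic depth mechanism (assumed PyTorch; the class name is hypothetical): during training each residual branch is kept only with a survival probability, and at test time the branch is always used but scaled by that probability.

```python
# Illustrative sketch only; per-block Bernoulli dropping of whole residual branches during training.
import torch
import torch.nn as nn

class StochasticDepthResidual(nn.Module):
    def __init__(self, branch, survival_prob=0.8):
        super().__init__()
        self.branch = branch
        self.survival_prob = survival_prob

    def forward(self, x):
        if self.training:
            if torch.rand(1).item() < self.survival_prob:
                return x + self.branch(x)      # block is kept for this mini-batch
            return x                           # block is skipped entirely
        # At test time the branch is always used, scaled by its survival probability.
        return x + self.survival_prob * self.branch(x)

branch = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 16, 3, padding=1))
print(StochasticDepthResidual(branch)(torch.randn(1, 16, 32, 32)).shape)
```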
Deep Residual Learning for Image Recognition
TLDR
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. One example is the Inception architecture that has been shown to achieve…
Identity Mappings in Deep Residual Networks
TLDR
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
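The sketch below, assuming PyTorch, shows the pre-activation ordering that paper argues for: batch normalization and ReLU precede each convolution and nothing is applied after the addition, keeping the identity path a clean signal route.

```python
# Illustrative pre-activation residual unit (assumed PyTorch; not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreActBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        return x + out          # no activation after the addition

print(PreActBlock(16)(torch.randn(1, 16, 32, 32)).shape)
```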
FitNets: Hints for Thin Deep Nets
TLDR
This paper extends the idea of a student network that could imitate the soft output of a larger teacher network or ensemble of networks, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student.
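A rough sketch of a FitNets-style hint loss under assumed PyTorch conventions: the student's intermediate feature map is mapped onto the teacher's with a small regressor and penalized with an L2 loss. The names and shapes here are hypothetical.

```python
# Hedged sketch of an intermediate-representation "hint" loss (assumed PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

def hint_loss(student_feat, teacher_feat, regressor):
    """L2 penalty pushing the regressed student feature map toward the teacher's."""
    return F.mse_loss(regressor(student_feat), teacher_feat)

regressor = nn.Conv2d(32, 64, kernel_size=1)   # maps the thin student's channels to the teacher's
loss = hint_loss(torch.randn(4, 32, 8, 8), torch.randn(4, 64, 8, 8), regressor)
print(loss.item())
```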
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Understanding the difficulty of training deep feedforward neural networks
TLDR
The objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future.
ImageNet classification with deep convolutional neural networks
TLDR
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Deeply-Supervised Nets
TLDR
The proposed deeply-supervised nets (DSN) method simultaneously minimizes classification error while making the learning process of hidden layers direct and transparent, and extends techniques from stochastic gradient methods to analyze the algorithm.
Dropout: a simple way to prevent neural networks from overfitting
TLDR
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.