# Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks

    @article{Peer2021ConflictingBA,
      title   = {Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks},
      author  = {David Peer and Sebastian Stabinger and Antonio Jose Rodr{\'i}guez-S{\'a}nchez},
      journal = {2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
      year    = {2021},
      pages   = {256-265}
    }

Designing neural network architectures is a challenging task, and knowing which specific layers of a model must be adapted to improve performance is almost a mystery. In this paper, we introduce a novel theory and metric to identify layers that decrease the test accuracy of the trained model; this identification is possible as early as the beginning of training. In the worst case, such a layer can lead to a network that cannot be trained at all. More precisely, we identified those layers…
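The paper's central notion can be illustrated with a toy sketch: samples whose activations coincide at some layer form a "bundle", and a bundle "conflicts" when it mixes different labels, since the rest of the network can no longer tell those samples apart. The function below is a hypothetical from-scratch illustration — the rounding-based grouping, the names, and the returned fraction are our assumptions, not the paper's exact metric:

```python
import numpy as np

def conflicting_bundle_fraction(activations, labels, decimals=3):
    """Group samples whose layer activations coincide (after rounding)
    into bundles; a bundle conflicts if it mixes different labels.
    Returns the fraction of samples caught in conflicting bundles."""
    keys = [a.round(decimals).tobytes() for a in activations]
    bundles = {}
    for k, y in zip(keys, labels):
        bundles.setdefault(k, []).append(y)
    conflicting = sum(len(ys) for ys in bundles.values() if len(set(ys)) > 1)
    return conflicting / len(labels)

# Toy example: two samples collapse to the same activation but carry
# different labels, so 2 of 3 samples sit in a conflicting bundle.
acts = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(conflicting_bundle_fraction(acts, [0, 1, 0]))  # → 0.666...
```

A layer that pushes many differently-labeled samples into the same bundle is, in this picture, a candidate for the problematic layers the paper identifies.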

#### 4 Citations

Orchid: Building Dynamic Test Oracles with Training Bias for Improving Deep Neural Network Models

- 2021 8th International Conference on Dependable Systems and Their Applications (DSA)
- 2021

The accuracy of deep neural network models is always a top priority when developing them. One problem affecting it is the extent to which such a model can resolve training samples that conflict with…

Training Deep Capsule Networks with Residual Connections

- Computer Science
- ICANN
- 2021

This paper proposes a methodology for training deeper capsule networks using residual connections, which is evaluated on four datasets and three different routing algorithms, showing that performance does in fact increase when training deeper capsule networks.

Greedy Layer Pruning: Decreasing Inference Time of Transformer Models

- Computer Science
- ArXiv
- 2021

Fine-tuning transformer models after unsupervised pre-training achieves very high performance on many different NLP tasks. Unfortunately, transformers suffer from long inference times, which greatly…

conflicting_bundle.py - A Python module to identify problematic layers in deep neural networks

- Computer Science
- Softw. Impacts
- 2021

#### References

Showing 1–10 of 34 references

Data-dependent Initializations of Convolutional Neural Networks

- Computer Science
- ICLR
- 2016

This work presents a fast and simple data-dependent initialization procedure that sets the weights of a network such that all units train at roughly the same rate, avoiding vanishing or exploding gradients.
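The cited idea admits a minimal sketch: rescale each unit's weights so that its pre-activations have unit standard deviation on a batch of real data, so every unit starts training at a comparable rate. Everything below — the variable names and the single-layer setting — is an illustrative assumption, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 100))        # a batch of real input data
W = rng.normal(size=(100, 50)) * 5.0   # badly scaled initial weights

# Rescale each unit (column) so its pre-activations have unit std on
# this batch; units no longer start with wildly different magnitudes.
pre = X @ W
W_scaled = W / pre.std(axis=0, keepdims=True)
print((X @ W_scaled).std(axis=0).round(2))  # ~1.0 for every unit
```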

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

- Computer Science
- ICML
- 2015

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps and beats the original model by a significant margin.
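The normalization itself is simple to sketch. The snippet below shows the standard per-feature transform with learnable scale gamma and shift beta — a minimal NumPy illustration, not a framework implementation, and it omits the running statistics used at inference time:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then apply a learnable
    scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=10.0, size=(128, 4))
out = batch_norm(x)
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ≈0 and ≈1
```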

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

- Computer Science, Mathematics
- ICML
- 2019

A new scaling method is proposed that uniformly scales all dimensions of depth, width, and resolution using a simple yet highly effective compound coefficient; its effectiveness is demonstrated by scaling up MobileNets and ResNet.
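The compound coefficient can be sketched numerically: depth, width, and input resolution are scaled jointly by a single exponent phi. The constants alpha = 1.2, beta = 1.1, gamma = 1.15 are the ones reported for the EfficientNet baseline; the helper below is otherwise an illustrative sketch:

```python
# Compound scaling: one coefficient phi scales depth, width, and
# resolution together, subject to alpha * beta^2 * gamma^2 ≈ 2 so that
# each unit increase of phi roughly doubles the FLOPs.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    depth = ALPHA ** phi        # layer-count multiplier
    width = BETA ** phi         # channel multiplier
    resolution = GAMMA ** phi   # input-resolution multiplier
    return depth, width, resolution

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```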

On the Expressive Power of Deep Neural Networks

- Computer Science, Mathematics
- ICML
- 2017

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute…

The Shattered Gradients Problem: If resnets are the answer, then what is the question?

- Computer Science, Mathematics
- ICML
- 2017

It is shown that the correlation between gradients in standard feedforward networks decays exponentially with depth, resulting in gradients that resemble white noise, whereas the gradients in architectures with skip connections are far more resistant to shattering, decaying only sublinearly.

Understanding the difficulty of training deep feedforward neural networks

- Computer Science, Mathematics
- AISTATS
- 2010

The objective here is to understand why standard gradient descent from random initialization performs so poorly on deep neural networks, in order to explain recent relative successes and help design better algorithms in the future.

Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach

- Computer Science, Mathematics
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017

It is proved that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise. It is further shown how the label-noise probabilities can be estimated, adapting a recent technique for noise estimation to the multi-class setting and providing an end-to-end framework.

Deep Residual Learning for Image Recognition

- Computer Science
- 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize and can gain accuracy from considerably increased depth.
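The residual formulation y = F(x) + x is easy to sketch. The toy block below — plain NumPy, hypothetical names, no batch norm or convolutions — shows why a residual block with near-zero weights reduces to the identity instead of degrading the signal:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = F(x) + x: the block learns a residual F on top of the
    identity, so a near-zero F leaves the input unchanged."""
    return relu(x @ W1) @ W2 + x

# With zero weights the block is exactly the identity mapping,
# which is what makes stacking many such blocks safe.
x = np.arange(6.0).reshape(2, 3)
W0 = np.zeros((3, 3))
print(residual_block(x, W0, W0))  # identical to x
```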

Residual Networks Behave Like Ensembles of Relatively Shallow Networks

- Computer Science
- NIPS
- 2016

This work proposes a novel interpretation of residual networks, showing that they can be seen as a collection of many paths of differing length, and reveals one of the key characteristics that seem to enable the training of very deep networks: residual networks avoid the vanishing-gradient problem by introducing short paths that can carry gradient throughout the extent of very deep networks.

Highway Networks

- Computer Science
- ArXiv
- 2015

A new architecture designed to ease gradient-based training of very deep networks is introduced, characterized by gating units that learn to regulate the flow of information through the network.
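The gating scheme can be sketched as y = T(x) * H(x) + (1 - T(x)) * x, where H is the nonlinear transform and T is the transform gate. The snippet below is a minimal NumPy illustration — the names are hypothetical, though starting the gate bias negative so the layer begins close to the identity is the trick the architecture relies on:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, W_t, b_t=-2.0):
    """y = T(x) * H(x) + (1 - T(x)) * x: the transform gate T decides
    how much of the nonlinear transform H to let through; a negative
    gate bias starts the layer close to the identity mapping."""
    H = np.tanh(x @ W_h)
    T = sigmoid(x @ W_t + b_t)
    return T * H + (1.0 - T) * x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_h = rng.normal(size=(8, 8)) * 0.1
W_t = np.zeros((8, 8))
# With W_t = 0 and b_t = -2, the gate is sigmoid(-2) ≈ 0.12 everywhere,
# so early in training the output stays close to the input.
y = highway_layer(x, W_h, W_t)
```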