Convolutional Networks with Adaptive Inference Graphs

@inproceedings{Veit2018ConvolutionalNW,
  title={Convolutional Networks with Adaptive Inference Graphs},
  author={Andreas Veit and Serge J. Belongie},
  booktitle={ECCV},
  year={2018}
}
Do convolutional networks really need a fixed feed-forward structure? What if, after identifying the high-level concept of an image, a network could move directly to a layer that can distinguish fine-grained differences? Currently, a network would first need to execute sometimes hundreds of intermediate layers that specialize in unrelated aspects. Ideally, the more a network already knows about an image, the better it should be at deciding which layer to compute next. In this work, we propose…
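The core mechanism lends itself to a short sketch. The PyTorch snippet below (module and parameter names are hypothetical, not the paper's implementation) shows a residual block whose execution is decided by a small gate computed from globally pooled input features; the paper trains such gates end-to-end with a differentiable relaxation of the discrete decision, which is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualBlock(nn.Module):
    """Residual block that is only executed when a small gate, computed from
    globally pooled input features, decides it is relevant for this input."""
    def __init__(self, channels, gate_hidden=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Lightweight gate: global average pool -> two-layer MLP -> on/off logit.
        self.gate = nn.Sequential(
            nn.Linear(channels, gate_hidden),
            nn.ReLU(inplace=True),
            nn.Linear(gate_hidden, 1),
        )

    def forward(self, x):
        logit = self.gate(x.mean(dim=(2, 3)))          # (B, 1) relevance logit
        keep = (torch.sigmoid(logit) > 0.5).float()    # hard decision at inference
        keep = keep.view(-1, 1, 1, 1)
        if keep.sum() == 0:                            # no sample needs this layer,
            return F.relu(x)                           # so skip the convolutions entirely
        return F.relu(x + keep * self.body(x))

x = torch.randn(4, 64, 32, 32)
print(GatedResidualBlock(64)(x).shape)                 # torch.Size([4, 64, 32, 32])
```

Because each gate is computed from the block's own input, the set of layers that actually runs forms an input-dependent inference graph rather than a fixed chain.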
Improved Techniques for Training Adaptive Deep Networks
This paper considers a typical adaptive deep network with multiple intermediate classifiers and presents three techniques to improve its training efficacy from two aspects: a Gradient Equilibrium algorithm to resolve the learning conflict between the different classifiers, and an Inline Subnetwork Collaboration approach together with a One-for-all Knowledge Distillation algorithm to enhance the collaboration among classifiers.
You Look Twice: GaterNet for Dynamic Filter Selection in CNNs
This paper investigates input-dependent dynamic filter selection in deep convolutional neural networks (CNNs) and proposes a novel yet simple framework called GaterNet, which involves a backbone and a gater network.
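As a rough sketch of the filter-gating idea (not GaterNet's actual architecture; all names and sizes below are illustrative), a small gater network maps the input image to per-filter on/off gates that multiply the backbone's channels:

```python
import torch
import torch.nn as nn

class TinyGater(nn.Module):
    """Maps the input image to per-filter on/off gates for one backbone layer."""
    def __init__(self, num_filters):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, num_filters)

    def forward(self, img):
        h = self.features(img).flatten(1)
        # Hard gates for inference; training would need a differentiable relaxation.
        return (torch.sigmoid(self.fc(h)) > 0.5).float()

class GatedBackbone(nn.Module):
    """Backbone whose convolutional filters are switched on or off by the gater."""
    def __init__(self, num_filters=32, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, num_filters, 3, padding=1)
        self.gater = TinyGater(num_filters)
        self.head = nn.Linear(num_filters, num_classes)

    def forward(self, img):
        gates = self.gater(img)                            # (B, C) input-dependent selection
        feat = torch.relu(self.conv(img)) * gates[:, :, None, None]
        return self.head(feat.mean(dim=(2, 3)))

print(GatedBackbone()(torch.randn(2, 3, 32, 32)).shape)    # torch.Size([2, 10])
```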
Data Agnostic Filter Gating for Efficient Deep Networks
This paper proposes a data agnostic filter pruning method that uses an auxiliary network named Dagger module to induce pruning and takes pretrained weights as input to learn the importance of each filter.
Efficient adaptive inference for deep convolutional neural networks using hierarchical early exits
The BoF model is extended and adapted to the needs of early exits by constructing additive shared histogram spaces that gradually refine the information extracted from the various layers of a network, in a hierarchical manner, while also employing a classification layer reuse strategy to further reduce the number of parameters needed per exit layer.
Distilled Hierarchical Neural Ensembles with Adaptive Inference Cost
HNE is proposed: a novel framework that embeds an ensemble of multiple networks by sharing intermediate layers in a hierarchical structure. A second contribution is a novel co-distillation method to boost the performance of ensemble predictions at low inference cost.
Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers
The proposed method, Neural Function Modules (NFM), aims to introduce the same structural capability into deep learning by combining attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm which improves the results in standard classification, out-of-domain generalization, generative modeling, and learning representations in the context of reinforcement learning.
Batch-Shaped Channel Gated Networks
The method can slim down large architectures conditionally, such that the average computational cost on the data is on par with that of a smaller architecture, but with higher accuracy. The resulting networks automatically learn to use more features for difficult examples and fewer features for simple examples.
Learning to Compose Hypercolumns for Visual Correspondence
This work introduces a novel approach to visual correspondence that dynamically composes effective features by selecting a small number of relevant layers from a deep convolutional neural network, conditioned on the images to match.
Joint Spatial and Layer Attention for Convolutional Networks
Tony Joseph, Faculty of Science (Computer Science), University of Ontario Institute of Technology, 2019. In this work, we propose a novel…
SONG: Self-Organizing Neural Graphs
An extensive theoretical study of SONG is provided, complemented by experiments conducted on the Letter, Connect4, MNIST, CIFAR, and TinyImageNet datasets, showing that the method performs on par with or better than existing decision models.

References

Showing 1-10 of 44 references.
Multi-Scale Dense Convolutional Networks for Efficient Prediction
This work introduces a new convolutional neural network architecture with the ability to adapt dynamically to computational resource limits at test time, substantially improving the state of the art in both evaluation settings.
Densely Connected Convolutional Networks
The Dense Convolutional Network (DenseNet) connects each layer to every other layer in a feed-forward fashion and has several compelling advantages: it alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.
Residual Networks Behave Like Ensembles of Relatively Shallow Networks
This work proposes a novel interpretation of residual networks, showing that they can be seen as a collection of many paths of differing length, and reveals one of the key characteristics that seem to enable the training of very deep networks: residual networks avoid the vanishing-gradient problem by introducing short paths that can carry gradient throughout the extent of very deep networks.
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
This work introduces a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks, and applies the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora.
Identity Mappings in Deep Residual Networks
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
Neural Module Networks
This work describes a procedure for constructing and learning neural module networks, which compose collections of jointly trained neural "modules" into deep networks for question answering, and uses these structures to dynamically instantiate modular networks (with reusable components for recognizing dogs, classifying colors, etc.).
Wide Residual Networks
This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width increased; the resulting network structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.
Visualizing and Understanding Convolutional Networks
A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large convolutional network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.
Learning Multiple Layers of Features from Tiny Images
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
BranchyNet: Fast inference via early exiting from deep neural networks
BranchyNet is presented: a novel deep network architecture augmented with additional side-branch classifiers that can both improve accuracy and significantly reduce the inference time of the network.