Corpus ID: 167217261

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

@article{Tan2019EfficientNetRM,
  title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
  author={Mingxing Tan and Quoc V. Le},
  journal={ArXiv},
  year={2019},
  volume={abs/1905.11946}
}
Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. [...] To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet…
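To make the compound-scaling idea in the abstract concrete, here is a minimal sketch (not the authors' code): it scales a baseline network's depth, width, and input resolution together with a single compound coefficient phi, using the fact that ConvNet FLOPs grow roughly in proportion to depth × width² × resolution². The baseline numbers and the helper name `compound_scale` are illustrative; the per-dimension bases α=1.2, β=1.1, γ=1.15 are the grid-searched values reported in the paper, chosen so that α·β²·γ² ≈ 2, i.e. each unit of phi roughly doubles FLOPs.

```python
def compound_scale(base_depth, base_width, base_resolution, phi,
                   alpha=1.2, beta=1.1, gamma=1.15):
    """Scale a baseline ConvNet's depth/width/resolution by a compound coefficient phi.

    FLOPs ~ depth * width**2 * resolution**2, so constraining
    alpha * beta**2 * gamma**2 ~= 2 means each unit increase of phi
    roughly doubles the FLOPs of the scaled model.
    """
    depth = base_depth * (alpha ** phi)            # more layers
    width = base_width * (beta ** phi)             # more channels per layer
    resolution = base_resolution * (gamma ** phi)  # larger input images
    return round(depth), round(width), round(resolution)

# Illustrative baseline (hypothetical numbers, not the exact EfficientNet-B0 spec).
for phi in range(4):
    d, w, r = compound_scale(18, 32, 224, phi)
    flops_x = (1.2 * 1.1**2 * 1.15**2) ** phi
    print(f"phi={phi}: depth~{d}, width~{w}, resolution~{r}, FLOPs x{flops_x:.2f}")
```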
Greedy Network Enlarging
TLDR
This paper proposes a greedy network enlarging method based on the reallocation of computations to enlarge the capacity of CNN models by improving their width, depth and resolution at the stage level.
Depth-Wise Neural Architecture Search
TLDR
This work proposes a NAS approach to efficiently design accurate and low-cost convolutional architectures and demonstrates that an efficient strategy for designing these architectures is to learn the depth stage-by-stage, such that stages with low importance are kept shallow while stages with high importance become deeper.
ThriftyNets: Convolutional Neural Networks with Tiny Parameter Budget
TLDR
In an effort to use every parameter of a network at its maximum, a new convolutional neural network architecture is proposed, called ThriftyNet, which achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40K parameters in total.
Dynamic Resolution Network
TLDR
This paper proposes a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically based on each input sample, and learns the smallest resolution that can retain and even exceed the original recognition accuracy for each image.
NeuralScale: Efficient Scaling of Neurons for Resource-Constrained Deep Neural Networks
  • Eugene Lee, Chen-Yi Lee
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR
This work attempts to search for the neuron (filter) configuration of a fixed network architecture that maximizes accuracy using iterative pruning methods as a proxy, and introduces architecture descent, which iteratively refines the parametrized function used for model scaling.
Revisiting ResNets: Improved Training and Scaling Strategies
TLDR
It is found that training and scaling strategies may matter more than architectural changes, and further, that the resulting ResNets match recent state-of-the-art models.
Fast and Accurate Model Scaling
TLDR
This work proposes a simple fast compound scaling strategy that encourages primarily scaling model width, while scaling depth and resolution to a lesser extent, and provides a framework for analyzing scaling strategies under various computational constraints.
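As a rough illustration of the width-primary idea described in the entry above (a toy split, not that paper's exact parametrization): since FLOPs grow approximately as depth × width² × resolution², giving most of the log-FLOPs budget to width lets width grow close to the square root of the FLOPs multiplier while depth and resolution barely move. The function name `fast_scale` and the 80/10/10 split are assumptions for illustration only.

```python
def fast_scale(base_depth, base_width, base_resolution, flops_multiplier,
               width_share=0.8):
    """Toy width-primary scaling: give `width_share` of the log-FLOPs budget
    to width and split the remainder evenly between depth and resolution.

    FLOPs ~ depth * width**2 * resolution**2, so the multipliers below satisfy
    d_mult * w_mult**2 * r_mult**2 == flops_multiplier.
    """
    s = flops_multiplier
    other_share = (1.0 - width_share) / 2.0
    d_mult = s ** other_share           # depth enters FLOPs linearly
    w_mult = s ** (width_share / 2.0)   # width enters FLOPs quadratically
    r_mult = s ** (other_share / 2.0)   # resolution enters FLOPs quadratically
    return (round(base_depth * d_mult),
            round(base_width * w_mult),
            round(base_resolution * r_mult))

# Quadrupling the FLOPs of a hypothetical baseline: width grows ~1.7x,
# depth by ~15% and resolution by ~7%.
print(fast_scale(18, 32, 224, 4.0))
```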
Scale Calibrated Training: Improving Generalization of Deep Networks via Scale-Specific Normalization
TLDR
A novel normalization scheme called Scale-Specific Batch Normalization replaces standard batch normalization in SCT, improving the accuracy of a single ResNet-50 on ImageNet by 1.7% and 11.5% when testing on image sizes of 224 and 128, respectively.
Neural Epitome Search for Architecture-Agnostic Network Compression
TLDR
This work presents a novel auto-sampling method that is applicable to both 1D and 2D CNNs, with significant performance improvement over WSNet, and outperforms some neural architecture search (NAS) based methods such as AMC and MnasNet.
A closer look at network resolution for efficient network design
TLDR
This paper proposes a framework to mutually learn from different input resolutions and network widths that achieves consistently better ImageNet top-1 accuracy than US-Net under different computation constraints, and outperforms the best compound-scaled EfficientNet model by 1.5%.

References

Showing 1-10 of 56 references
Rethinking the Inception Architecture for Computer Vision
TLDR
This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization.
PolyNet: A Pursuit of Structural Diversity in Very Deep Networks
TLDR
This work presents a new family of modules, namely the PolyInception, which can be flexibly inserted in isolation or in a composition as replacements of different parts of a network, and demonstrates substantial improvements over the state-of-the-art on the ILSVRC 2012 benchmark.
MnasNet: Platform-Aware Neural Architecture Search for Mobile
TLDR
An automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency.
Aggregated Residual Transformations for Deep Neural Networks
TLDR
On the ImageNet-1K dataset, it is empirically shown that, even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy and is more effective than going deeper or wider when capacity is increased.
Progressive Neural Architecture Search
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms…
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge…
Deep Networks with Stochastic Depth
TLDR
Stochastic depth is proposed, a training procedure that enables the seemingly contradictory setup of training short networks and using deep networks at test time; it reduces training time substantially and significantly improves the test error on almost all data sets used for evaluation.
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
TLDR
This work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs, and derives several practical guidelines for efficient network design, called ShuffleNet V2.
On the Expressive Power of Overlapping Architectures of Deep Learning
TLDR
This work theoretically analyzes the effect of "overlaps" in the convolutional process, and shows that having overlapping local receptive fields, and more broadly denser connectivity, results in an exponential increase in the expressive capacity of neural networks.
Learning Transferable Architectures for Scalable Image Recognition
TLDR
This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset, and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.