Corpus ID: 167217261

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mingxing Tan, Quoc V. Le
Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. […] To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet…
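The compound scaling rule behind EfficientNet multiplies depth, width, and resolution by powers of a single coefficient φ. A minimal sketch, using the coefficients α=1.2, β=1.1, γ=1.15 reported in the paper (the baseline depth, width, and resolution values below are illustrative, not from the paper):

```python
def compound_scale(base_depth, base_width, base_resolution, phi,
                   alpha=1.2, beta=1.1, gamma=1.15):
    """Scale depth, width, and resolution jointly by one coefficient phi.

    Coefficients satisfy alpha * beta^2 * gamma^2 ~ 2, so each unit
    increase of phi roughly doubles total FLOPs.
    """
    depth = base_depth * (alpha ** phi)            # layer-count multiplier
    width = base_width * (beta ** phi)             # channel-count multiplier
    resolution = base_resolution * (gamma ** phi)  # input-size multiplier
    return depth, width, resolution

# Illustrative baseline: 18 layers, width multiplier 1.0, 224x224 input.
d, w, r = compound_scale(18, 1.0, 224, phi=1)
```

The key point the paper makes is that scaling all three dimensions together with a fixed ratio outperforms scaling any one of them in isolation.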

EfficientNetV2: Smaller Models and Faster Training

An improved method of progressive learning, which adaptively adjusts regularization (e.g., dropout and data augmentation) along with image size is proposed, which significantly outperforms previous models on ImageNet and CIFAR/Cars/Flowers datasets.

Greedy Network Enlarging

This paper proposes a greedy network enlarging method, based on reallocating computation, that enlarges the capacity of CNN models by increasing their width, depth, and resolution at the stage level.

TResNet: High Performance GPU-Dedicated Architecture

A series of architecture modifications is introduced that aims to boost neural networks’ accuracy while retaining their GPU training and inference efficiency, together with a new family of GPU-dedicated models, called TResNet, which achieves better accuracy and efficiency than previous ConvNets.

Network Amplification with Efficient MACs Allocation

This paper proposes to enlarge the capacity of CNN models through fine-grained MACs allocation over width, depth, and resolution at the stage level, solved with dynamic programming, and consistently outperforms the original scaling method.

Scaling Down: Efficient Inference for Convolutional Neural Networks

This paper explores several efficient architectures that meet a baseline accuracy on an image recognition task, and trains a NasNet-A convolutional neural network to an accuracy of 0.8034 with 5.2M parameters, 662M multiplication operations, and 659M addition operations.

Revisiting ResNets: Improved Training and Scaling Strategies

It is found that training and scaling strategies may matter more than architectural changes, and further, that the resulting ResNets match recent state-of-the-art models.

ThreshNet: An Efficient DenseNet Using Threshold Mechanism to Reduce Connections

This work proposes a new network architecture, ThreshNet, using a threshold mechanism to further optimize the connection method, and shows that, compared with HarDNet68, GhostNet, MobileNetV2, ShuffleNet, and EfficientNet, the inference time of the proposed ThreshNet79 is 5%, 9%, 10%, 18%, and 20% faster, respectively.

UPANets: Learning from the Universal Pixel Attention Networks

This work proposes an efficient but robust backbone equipped with channel and spatial attention; the attention helps expand receptive fields in shallow convolutional layers and passes the information to every layer.

Depth-Wise Neural Architecture Search

This work proposes a NAS approach to efficiently design accurate and low-cost convolutional architectures and demonstrates that an efficient strategy for designing these architectures is to learn the depth stage-by-stage, such that stages with low importance are kept shallow while stages with high importance become deeper.

Rethinking the Inception Architecture for Computer Vision

This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
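The factorization idea can be made concrete with a parameter count: two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution but use fewer parameters. A minimal sketch (the channel count is a hypothetical example, biases ignored):

```python
def conv_params(kernel, in_ch, out_ch):
    # Parameters of a kernel x kernel convolution layer (bias terms ignored).
    return kernel * kernel * in_ch * out_ch

c = 64  # hypothetical channel count, kept equal for input and output
single_5x5 = conv_params(5, c, c)       # 25 * c^2 parameters
stacked_3x3 = 2 * conv_params(3, c, c)  # 18 * c^2 parameters, same 5x5 receptive field
ratio = stacked_3x3 / single_5x5        # 18/25 = 0.72, a 28% parameter saving
```

The stacked version also inserts an extra nonlinearity between the two 3×3 layers, which the Inception-v3 work argues further helps expressiveness.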

PolyNet: A Pursuit of Structural Diversity in Very Deep Networks

This work presents a new family of modules, namely the PolyInception, which can be flexibly inserted in isolation or in a composition as replacements of different parts of a network, and demonstrates substantial improvements over the state-of-the-art on the ILSVRC 2012 benchmark.

MnasNet: Platform-Aware Neural Architecture Search for Mobile

An automated mobile neural architecture search (MNAS) approach is proposed that explicitly incorporates model latency into the main objective, so that the search can identify a model that achieves a good trade-off between accuracy and latency.
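MnasNet folds latency into the search objective as a soft constraint of the form ACC(m) · (LAT(m)/T)^w. A minimal sketch of that reward, using the exponent w = −0.07 reported in the paper (the accuracy and latency numbers are illustrative):

```python
def mnas_reward(accuracy, latency_ms, target_ms, w=-0.07):
    # Soft-constraint multi-objective reward: ACC(m) * (LAT(m) / T)^w.
    # With w < 0, models slower than the target are smoothly penalized.
    return accuracy * (latency_ms / target_ms) ** w

on_target = mnas_reward(0.75, 80.0, 80.0)  # latency equals target: reward == accuracy
too_slow = mnas_reward(0.76, 160.0, 80.0)  # 2x over budget: reward drops below 0.76
```

Because the penalty is smooth rather than a hard cutoff, the search can still explore slightly-over-budget models when they offer a large accuracy gain.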

Aggregated Residual Transformations for Deep Neural Networks

On the ImageNet-1K dataset, it is empirically shown that, even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy, and is more effective than going deeper or wider when the authors increase the capacity.

Progressive Neural Architecture Search

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Deep Networks with Stochastic Depth

Stochastic depth is proposed, a training procedure that enables the seemingly contradictory setup of training short networks while using deep networks at test time; it reduces training time substantially and improves the test error significantly on almost all datasets used for evaluation.
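The mechanism is simple: during training, each residual block is dropped with a depth-dependent probability, and at test time the surviving branch is scaled by its expected survival rate. A minimal sketch of the linear decay rule from the stochastic-depth paper (the scalar `x` and `branch` function stand in for tensors and a real residual branch):

```python
import random

def survival_prob(layer, num_layers, p_last=0.5):
    # Linear decay rule from the paper: p_l = 1 - (l / L) * (1 - p_L),
    # so early layers almost always survive and the last layer survives
    # with probability p_last.
    return 1.0 - (layer / num_layers) * (1.0 - p_last)

def residual_block(x, branch, layer, num_layers, training=True):
    p = survival_prob(layer, num_layers)
    if training:
        if random.random() < p:
            return x + branch(x)  # block survives this forward pass
        return x                  # block dropped: identity shortcut only
    return x + p * branch(x)      # test time: scale branch by expected survival
```

Dropping whole blocks this way shortens the expected depth during training, which is where the speedup and the regularization effect both come from.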

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

This work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs, and derives several practical guidelines for efficient network design, called ShuffleNet V2.

On the Expressive Power of Overlapping Architectures of Deep Learning

This work theoretically analyzes the effect of "overlaps" in the convolutional process, and shows that having overlapping local receptive fields, and more broadly denser connectivity, results in an exponential increase in the expressive capacity of neural networks.

Learning Transferable Architectures for Scalable Image Recognition

This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.