• Corpus ID: 54438210

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

@article{Cai2018ProxylessNASDN,
title={ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware},
author={Han Cai and Ligeng Zhu and Song Han},
journal={ArXiv},
year={2018},
volume={abs/1812.00332}
}
• Published 27 September 2018
• Computer Science
• ArXiv
Neural architecture search (NAS) has a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. $10^4$ GPU hours) makes it difficult to directly search the architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the cost of GPU hours via a continuous representation of network architecture but suffers from the high GPU memory consumption issue (grow linearly…
1,323 Citations

• Computer Science
ECCV
• 2020
DA-NAS is presented that can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner, and supports an argument search space to efficiently search the best-performing architecture.
• Computer Science
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2020
The proposed MemNAS is a novel growing and trimming based neural architecture search framework that optimizes not only performance but also memory requirement of an inference network and considers running memory use as an optimization objective along with performance.
• Computer Science
AAAI
• 2021
This work proposes to analyze the architecture transferability of different NAS methods by performing a series of experiments on large scale benchmarks such as ImageNet1K and ImageNet22K and finds that even on large datasets, random sampling baseline is very competitive, but the choice of the appropriate combination of proxy set and search strategy can provide significant improvement over it.
• Computer Science
• 2021
This work proposes a novel framework called training-free neural architecture search (TE-NAS), which ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space and shows that these two measurements imply the trainability and expressivity of a neural network.
• Yanxi Li, Chang Xu
• Computer Science
NeurIPS
• 2020
The theoretical analyses lead to AdaptNAS, a novel and principled approach to adapt neural architectures between domains in NAS, which shows that only a small part of ImageNet will be sufficient for AdaptNAS to extend its architecture success to the entire ImageNet and outperform state-of-the-art comparison algorithms.
This work proposes a generally applicable framework that introduces only minor changes to existing optimizers to leverage this feature of Neural Architecture Search and demonstrates that the proposed framework generally gives better results and, in the worst case, is just as good as the unmodified optimizer.
• Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence
• 2021
Experimental evaluation indicates that, across diverse image classification tasks and computational objectives, NAT is an appreciably more effective alternative to conventional transfer learning of fine-tuning weights of an existing network architecture learned on standard datasets.
This thesis trains the over-parameterized network for only one epoch before updating the network architecture, and the approach can be transferred directly to convolutional neural network compression by enforcing structural sparsity, achieving extremely sparse networks without accuracy deterioration.
• Computer Science
ArXiv
• 2019
A new approach for NAS is proposed, called NASIB, which adapts and attunes to the available computation resources by varying the exploration vs. exploitation trade-off and could lead to novel architectures that require less domain expertise than the majority of existing methods.
• Computer Science
ArXiv
• 2020
This paper presents a fast NPU-aware NAS methodology, called S3NAS, to find a CNN architecture with higher accuracy than the existing ones under a given latency constraint, and applies a modified Single-Path NAS technique to the proposed supernet structure.

References

• Computer Science
AAAI
• 2018
This paper proposes a new framework toward efficient architecture search by exploring the architecture space based on the current network and reusing its weights, and employs a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations.
• Computer Science
ICML
• 2018
Efficient Neural Architecture Search is a fast and inexpensive approach for automatic model design that establishes a new state-of-the-art among all methods without post-training processing and delivers strong empirical performance using far fewer GPU-hours.
This work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs, and derives several practical guidelines for efficient network design, called ShuffleNet V2.
• Mingxing Tan, Bo Chen
• Computer Science
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2019
An automated mobile neural architecture search (MNAS) approach is proposed, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency.
• Computer Science
ArXiv
• 2018
This work proposes NASH, an architecture search which considerably reduces the computational resources required for training novel architectures by applying network morphisms and aggressive learning rate schedules, and proposes Pareto-NASH, a method for multi-objective architecture search that allows approximating the Pareto front of architectures under multiple objectives, such as predictive performance and number of parameters, in a single run of the method.
• Computer Science
ICLR
• 2018
Surprisingly, this simple method for automatically searching for well-performing CNN architectures, based on a hill climbing procedure whose operators apply network morphisms followed by short optimization runs with cosine annealing, yields competitive results.
• Computer Science
ECCV
• 2018
DPP-Net is proposed: Device-aware Progressive Search for Pareto-optimal Neural Architectures, optimizing for both device-related and device-agnostic objectives, which achieves better performances: higher accuracy & shorter inference time on various devices.
• Computer Science
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 2018
BlockQNN is a block-wise network generation pipeline that automatically builds high-performance networks using the Q-Learning paradigm with an epsilon-greedy exploration strategy and offers a tremendous reduction of the search space in designing networks, spending only 3 days with 32 GPUs.
• Computer Science
NIPS
• 2015
BinaryConnect is introduced, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights in which gradients are accumulated, and near state-of-the-art results with BinaryConnect are obtained on the permutation-invariant MNIST, CIFAR-10 and SVHN.
• Computer Science
2017 IEEE International Conference on Computer Vision (ICCV)
• 2017
The approach is called network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy.