BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

@inproceedings{Yu2020BigNASSU,
  title={BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models},
  author={Jiahui Yu and Pengchong Jin and Hanxiao Liu and Gabriel Bender and Pieter-Jan Kindermans and Mingxing Tan and Thomas Huang and Xiaodan Song and Quoc V. Le},
  booktitle={ECCV},
  year={2020}
}
Neural architecture search (NAS) has shown promising results in discovering models that are both accurate and fast. For NAS, training a one-shot model has become a popular strategy to rank the relative quality of different architectures (child models) using a single set of shared weights. However, while one-shot model weights can effectively rank different network architectures, the absolute accuracies from these shared weights are typically far below those obtained from stand-alone training. To…
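As a rough illustration of the weight-sharing idea described above, the sketch below builds a toy supernet whose largest weights are shared by smaller child models via slicing, so several candidate widths can be scored with one set of parameters. This is a minimal sketch under assumed names (SliceableLinear, SuperNet) and a toy setup; it is not the BigNAS implementation.

    # Illustrative only: child models reuse slices of one shared weight tensor.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SliceableLinear(nn.Module):
        """Linear layer whose output width can be shrunk at run time."""
        def __init__(self, in_features, max_out_features):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(max_out_features, in_features) * 0.01)
            self.bias = nn.Parameter(torch.zeros(max_out_features))

        def forward(self, x, out_features):
            # A child model uses only the first `out_features` rows of the shared weight.
            return F.linear(x, self.weight[:out_features], self.bias[:out_features])

    class SuperNet(nn.Module):
        def __init__(self, in_dim=32, max_width=64, num_classes=10):
            super().__init__()
            self.layer1 = SliceableLinear(in_dim, max_width)
            self.head = nn.Linear(max_width, num_classes)

        def forward(self, x, width):
            h = F.relu(self.layer1(x, width))
            h = F.pad(h, (0, self.head.in_features - width))  # pad so the shared head accepts any child width
            return self.head(h)

    # Score candidate child widths with the same shared weights, without retraining.
    net = SuperNet()
    x = torch.randn(8, 32)
    for width in (16, 32, 64):
        print(width, net(x, width).shape)

The gap the abstract refers to is between the accuracy such a shared-weight child reaches and what the same architecture would reach if trained from scratch on its own.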
How Does Supernet Help in Neural Architecture Search?
TLDR
A comprehensive analysis on five search spaces, including NAS-Bench-101, NAS-Bench-201, DARTS-CIFAR10, DARTS-PTB, and ProxylessNAS, finds that a well-trained supernet is not necessarily a good architecture-ranking model and that it is easier to find better architectures from an effectively pruned search space with supernet training.
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking
TLDR
This paper proposes a one-shot neural ensemble architecture search (NEAS) solution that addresses the two challenges of an enlarged search space and the potentially higher complexity of the searched model, and introduces a novel diversity-based metric to guide search space shrinking.
NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization
TLDR
This paper proposes channel-level bypass connections that merge network depth and layer width into a single search dimension, reducing the time needed to train and evaluate sampled deep neural networks, and a multi-layer coordinate descent optimizer that considers the interplay of multiple layers in each optimization iteration, improving the performance of discovered DNNs while supporting non-differentiable search metrics.
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
TLDR
This paper carefully designs one-shot learning techniques and the search space to provide an adaptive and efficient way of developing tiny PLMs for various latency constraints, and proposes a more efficient development method that is even faster than developing a single PLM.
Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models
TLDR
It is found that even the most simplistic method for building committees from existing, independently trained networks can match or exceed the accuracy of state-of-the-art models while being drastically more efficient.
FBNetV5: Neural Architecture Search for Multiple Tasks in One Run
TLDR
FBNetV5 is proposed, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort; it outperforms the previous state-of-the-art in all three tasks.
Enabling NAS with Automated Super-Network Generation
TLDR
BootstrapNAS is presented, a software framework for automatic generation of super-networks for NAS that takes a pre-trained model from a popular architecture and automatically creates a super-network out of it, then uses state-of-the-art NAS techniques to train the super-network, resulting in subnetworks that significantly outperform the given pre-trained model.
Parameter Prediction for Unseen Deep Architectures
TLDR
This work proposes a hypernetwork that can predict performant parameters in a single forward pass taking a fraction of a second, even on a CPU, and learns a strong representation of neural architectures enabling their analysis.
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective
TLDR
This work proposes a novel framework called training-free neural architecture search (TE-NAS), which ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space and shows that these two measurements imply the trainability and expressivity of a neural network.
Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics
TLDR
This work presents a unified framework to understand and accelerate NAS, by disentangling “TEG” characteristics of searched networks – Trainability, Expressivity, Generalization – all assessed in a training-free manner, leading to both improved search accuracy and over 2.3× reduction in search time cost.

References

SHOWING 1-10 OF 56 REFERENCES
SMASH: One-Shot Model Architecture Search through HyperNetworks
TLDR
A technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture is proposed, achieving competitive performance with similarly-sized hand-designed networks.
Single Path One-Shot Neural Architecture Search with Uniform Sampling
TLDR
A Single Path One-Shot model is proposed to construct a simplified supernet, where all architectures are single paths, so that the weight co-adaptation problem is alleviated.
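For concreteness, a minimal sketch of the single-path, uniform-sampling training loop follows; the toy ChoiceBlock model and the hyper-parameters are illustrative assumptions, not the paper's code.

    # Illustrative: each step uniformly samples one candidate op per block,
    # so only a single path through the supernet is trained at a time.
    import random
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ChoiceBlock(nn.Module):
        """Holds several candidate ops; exactly one (a single path) runs per step."""
        def __init__(self, dim):
            super().__init__()
            self.ops = nn.ModuleList([
                nn.Linear(dim, dim),                            # candidate op 0
                nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),  # candidate op 1
            ])

        def forward(self, x, choice):
            return self.ops[choice](x)

    blocks = nn.ModuleList([ChoiceBlock(16) for _ in range(3)])
    head = nn.Linear(16, 10)
    opt = torch.optim.SGD(list(blocks.parameters()) + list(head.parameters()), lr=0.1)

    for step in range(100):
        x, y = torch.randn(8, 16), torch.randint(0, 10, (8,))
        path = [random.randrange(len(b.ops)) for b in blocks]  # uniform single-path sample
        h = x
        for block, choice in zip(blocks, path):
            h = block(h, choice)
        loss = F.cross_entropy(head(h), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

After supernet training, candidate paths are ranked with the shared weights and the selected architecture is retrained from scratch.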
DSNAS: Direct Neural Architecture Search Without Parameter Retraining
  • Shou-Yong Hu, Sirui Xie, +4 authors Dahua Lin
  • Computer Science, Mathematics
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
TLDR
DSNAS is proposed, an efficient differentiable NAS framework that simultaneously optimizes architecture and parameters with a low-biased Monte Carlo estimate and successfully discovers networks with comparable accuracy on ImageNet.
AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
TLDR
A simple and one-shot solution to set channel numbers in a neural network to achieve better accuracy under constrained resources (e.g., FLOPs, latency, memory footprint or model size) is presented.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
TLDR
A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated by scaling up MobileNets and ResNet.
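The compound scaling rule referred to here ties network depth, width, and input resolution to a single coefficient \phi, with the constants \alpha, \beta, \gamma found by a small grid search in the EfficientNet paper:

    \begin{aligned}
    \text{depth: } d = \alpha^{\phi}, \qquad
    \text{width: } w = \beta^{\phi}, \qquad
    \text{resolution: } r = \gamma^{\phi}, \\
    \text{s.t. } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad
    \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1.
    \end{aligned}

The constraint keeps total FLOPS growing by roughly 2^\phi as the model is scaled up.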
Neural Architecture Search with Reinforcement Learning
TLDR
This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
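A hedged sketch of that controller-plus-REINFORCE loop is below; the LSTM size, the moving-average baseline, and the stand-in reward are illustrative assumptions. In practice the reward would be the validation accuracy of a trained child model.

    # Illustrative REINFORCE-style controller: sample an architecture token by token,
    # receive a reward, and push up the log-probability of above-baseline samples.
    import torch
    import torch.nn as nn

    NUM_LAYERS, NUM_OPS = 4, 5
    controller = nn.LSTMCell(NUM_OPS, 64)
    policy_head = nn.Linear(64, NUM_OPS)
    embed = nn.Embedding(NUM_OPS, NUM_OPS)
    opt = torch.optim.Adam(
        list(controller.parameters()) + list(policy_head.parameters()) + list(embed.parameters()),
        lr=3e-4,
    )
    baseline = 0.0

    def reward_fn(arch):
        # Stand-in for training and validating the sampled child model.
        return float(sum(arch)) / (NUM_LAYERS * (NUM_OPS - 1))

    for step in range(200):
        h, c = torch.zeros(1, 64), torch.zeros(1, 64)
        inp = torch.zeros(1, NUM_OPS)
        log_probs, arch = [], []
        for _ in range(NUM_LAYERS):
            h, c = controller(inp, (h, c))
            dist = torch.distributions.Categorical(logits=policy_head(h))
            op = dist.sample()
            log_probs.append(dist.log_prob(op))
            arch.append(op.item())
            inp = embed(op)
        reward = reward_fn(arch)
        baseline = 0.95 * baseline + 0.05 * reward      # moving-average baseline
        loss = -(reward - baseline) * torch.stack(log_probs).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()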
Efficient Neural Architecture Search via Parameter Sharing
TLDR
Efficient Neural Architecture Search is a fast and inexpensive approach for automatic model design that establishes a new state-of-the-art among all methods without post-training processing and delivers strong empirical performance using far fewer GPU-hours.
AOWS: Adaptive and Optimal Network Width Search With Latency Constraints
TLDR
This work introduces a novel efficient one-shot NAS approach to optimally search for channel numbers, given latency constraints on a specific hardware, and proposes an adaptive channel configuration sampling scheme to gradually specialize the training phase to the target computational constraints.
Understanding and Simplifying One-Shot Architecture Search
TLDR
With careful experimental analysis, it is shown that it is possible to efficiently identify promising architectures from a complex search space without either hypernetworks or reinforcement learning controllers.
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
TLDR
ProxylessNAS is presented, which can directly learn architectures for large-scale target tasks and target hardware platforms; it is applied to specialize neural architectures for hardware using direct hardware metrics (e.g., latency) and provides insights for efficient CNN architecture design.
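ProxylessNAS also makes latency part of the training objective by penalizing the expected latency of the sampled architecture; below is a minimal sketch of that idea with made-up per-op latencies and an assumed trade-off weight, not the paper's code.

    # Illustrative expected-latency penalty for differentiable, hardware-aware NAS.
    # Per-op latencies would come from a lookup table measured on the target device.
    import torch
    import torch.nn.functional as F

    latency_table = torch.tensor([1.2, 3.4, 0.8])      # ms per candidate op (placeholder values)
    arch_params = torch.zeros(3, requires_grad=True)   # one logit per candidate op

    probs = F.softmax(arch_params, dim=0)
    expected_latency = (probs * latency_table).sum()   # differentiable w.r.t. the architecture logits

    task_loss = torch.tensor(2.0)                      # stand-in for the task loss of the sampled net
    lam = 0.1                                          # latency/accuracy trade-off weight (assumed)
    total_loss = task_loss + lam * expected_latency
    total_loss.backward()                              # gradients flow into the architecture logits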