Single Path One-Shot Neural Architecture Search with Uniform Sampling

@inproceedings{Guo2020SinglePO,
  title={Single Path One-Shot Neural Architecture Search with Uniform Sampling},
  author={Zichao Guo and Xiangyu Zhang and Haoyuan Mu and Wen Heng and Zechun Liu and Yichen Wei and Jian Sun},
  booktitle={ECCV},
  year={2020}
}
One-shot methods are a powerful Neural Architecture Search (NAS) framework, but their training is non-trivial and it is difficult to achieve competitive results on large-scale datasets like ImageNet. [...] Key Method: Once we have a trained supernet, we apply an evolutionary algorithm to efficiently search for the best-performing architectures without any fine-tuning. Comprehensive experiments verify that our approach is flexible and effective. It is easy to train and fast to search. It effortlessly supports complex…
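As a rough illustration of the single-path uniform-sampling scheme summarized above, the following PyTorch-style sketch trains a supernet by drawing one choice block per layer uniformly at random at every step. `SuperNet`-style interfaces, `train_loader`, and the hyperparameters are placeholders for illustration, not the authors' released code.

```python
import random
import torch

# Minimal sketch of single-path supernet training with uniform sampling.
# `supernet(images, path)` is assumed to run only the blocks selected by
# `path` (one choice index per layer); this interface is hypothetical.

def sample_uniform_path(num_layers, num_choices):
    """Pick one candidate block per layer, uniformly at random."""
    return [random.randrange(num_choices) for _ in range(num_layers)]

def train_supernet(supernet, train_loader, num_layers, num_choices, epochs=120):
    optimizer = torch.optim.SGD(supernet.parameters(), lr=0.5, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in train_loader:
            path = sample_uniform_path(num_layers, num_choices)
            optimizer.zero_grad()
            loss = criterion(supernet(images, path), labels)
            loss.backward()   # only the sampled blocks receive gradients
            optimizer.step()
```

After training, the supernet serves only as a fast accuracy estimator: candidate paths inherit the shared weights and are ranked by an evolutionary search without any fine-tuning, as described in the abstract.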
MixPath: A Unified Approach for One-shot Neural Architecture Search
TLDR
This paper discovers that in the studied search space, feature vectors summed from multiple paths are nearly multiples of those from a single path, which perturbs supernet training and its ranking ability, and proposes a novel mechanism called Shadow Batch Normalization (SBN) to regularize the disparate feature statistics.
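A minimal sketch of the Shadow Batch Normalization idea, assuming a module that keeps one set of BN statistics per possible number of active paths; the class name and interface below are illustrative, not MixPath's actual implementation.

```python
import torch.nn as nn

# Shadow BN sketch: one BatchNorm per possible number of active paths, so the
# roughly multiplicative difference in feature statistics between single-path
# and multi-path sums is normalized by separate running statistics.

class ShadowBatchNorm2d(nn.Module):
    def __init__(self, channels, max_paths):
        super().__init__()
        self.bns = nn.ModuleList([nn.BatchNorm2d(channels) for _ in range(max_paths)])

    def forward(self, x, num_active_paths):
        # select the BN "shadow" matching how many paths were summed into x
        return self.bns[num_active_paths - 1](x)
```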
One-Shot Neural Architecture Search via Novelty Driven Sampling
TLDR
A new approach, Efficient Novelty-driven Neural Architecture Search, is presented that samples the most novel (abnormal) architectures to train the supernet; only the weights of the single architecture sampled by the novelty search are optimized in each step, which greatly reduces the memory demand.
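One common way to realize novelty-driven sampling is to score candidates by their distance to an archive of previously trained architectures and pick the most distant one. The sketch below is a hedged illustration of that idea with a hypothetical encoding (a vector of per-layer choice indices), not the paper's implementation.

```python
import random
import numpy as np

# Novelty sketch: mean L1 distance to the k nearest architectures in an archive
# of already-sampled paths; the most novel candidate from a random pool is
# chosen to receive the next supernet update.

def novelty(candidate, archive, k=10):
    if not archive:
        return float("inf")
    dists = sorted(int(np.abs(np.array(a) - np.array(candidate)).sum()) for a in archive)
    return float(np.mean(dists[:k]))

def sample_novel_path(num_layers, num_choices, archive, pool_size=50):
    pool = [[random.randrange(num_choices) for _ in range(num_layers)]
            for _ in range(pool_size)]
    return max(pool, key=lambda c: novelty(c, archive))
```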
Balanced One-shot Neural Architecture Optimization
TLDR
Balanced NAO is proposed, which introduces balanced training of the supernet during the search procedure: architectures are sampled in proportion to their model sizes, so that large architectures receive more updates than small ones.
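As a hedged illustration of size-proportional sampling, the snippet below weights the sampling probability of each candidate by its size; `model_size` is a hypothetical helper (e.g. parameter count or FLOPs), not part of the paper.

```python
import random

# Size-proportional sampling sketch: larger architectures are drawn (and hence
# updated) more often than smaller ones during supernet training.

def sample_by_size(candidates, model_size):
    weights = [model_size(c) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]
```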
Improving One-Shot NAS by Suppressing the Posterior Fading
  • Xiang Li, C. Lin, +4 authors Wanli Ouyang
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR
This paper analyzes existing weight sharing one-shot NAS approaches from a Bayesian point of view and identifies the Posterior Fading problem, which compromises the effectiveness of shared weights, and presents a novel approach to guide the parameter posterior towards its true distribution.
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking
TLDR
This paper proposes a one-shot neural ensemble architecture search (NEAS) solution that addresses the two challenges of an enlarged search space and potentially higher complexity of the searched model, and introduces a novel diversity-based metric to guide search space shrinking.
Powering One-shot Topological NAS with Stabilized Share-parameter Proxy
TLDR
The difficulties of architecture search in such a complex space are eliminated by the proposed stabilized share-parameter proxy, which employs Stochastic Gradient Langevin Dynamics to enable fast shared-parameter sampling and thus achieves a stabilized measurement of architecture performance even in search spaces with complex topological structures.
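For reference, a toy Stochastic Gradient Langevin Dynamics step is sketched below under one standard formulation (a gradient step plus Gaussian noise scaled by the step size); it illustrates the basic ingredient such a proxy relies on and is not the paper's procedure.

```python
import torch

# Toy SGLD update: plain gradient descent plus N(0, 2*lr) noise per parameter,
# which turns optimization into (approximate) posterior sampling.

def sgld_step(params, lr):
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
                p.add_(-lr * p.grad + noise)
```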
GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet
TLDR
This paper proposes a multi-path sampling strategy with rejection that greedily filters out weak paths, easing the burden on the supernet by encouraging it to focus on evaluating the potentially good paths, which are identified using a surrogate portion of validation data.
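A hedged sketch of the sampling-with-rejection idea: score a batch of randomly sampled paths on a small surrogate validation split using the current supernet, and keep only the top-scoring paths for subsequent training. The function names and interface are placeholders, not GreedyNAS's code.

```python
import torch

# Greedy path filtering sketch: rank candidate paths by surrogate validation
# accuracy under the shared supernet weights and reject the weak ones.

@torch.no_grad()
def filter_paths(supernet, surrogate_loader, paths, keep=5):
    scores = []
    for path in paths:
        correct, total = 0, 0
        for images, labels in surrogate_loader:
            preds = supernet(images, path).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
        scores.append(correct / max(total, 1))
    ranked = sorted(zip(paths, scores), key=lambda pair: pair[1], reverse=True)
    return [path for path, _ in ranked[:keep]]
```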
Understanding and Improving One-shot Neural Architecture Optimization
TLDR
This work empirically investigates the main factors that lead to the gaps, and hence the weak ranking correlation, between architectures under one-shot training and the same architectures under stand-alone complete training, and proposes NAO-V2 to alleviate such gaps.
PONAS: Progressive One-shot Neural Architecture Search for Very Efficient Deployment
  • Sian-Yao Huang, W. Chu
  • Computer Science
  • 2021 International Joint Conference on Neural Networks (IJCNN)
  • 2021
TLDR
In PONAS, a two-stage training scheme consisting of a meta-training stage and a fine-tuning stage is proposed, combining the advantages of progressive NAS and one-shot methods to make the search process efficient and stable.
BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
TLDR
BigNAS is proposed, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to obtain good prediction accuracies; it trains a single set of shared weights on ImageNet and uses these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs.

References

Showing 1-10 of 49 references
You Only Search Once: Single Shot Neural Architecture Search via Direct Sparse Optimization
TLDR
The motivation behind DSO-NAS is to address the task from the perspective of model pruning; it enjoys the advantages of both differentiability and efficiency, so it can be directly applied to large datasets like ImageNet and to tasks beyond classification.
Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours
TLDR
This work proposes Single-Path NAS, a novel differentiable NAS method for designing hardware-efficient ConvNets in less than 4 hours; it uses a single-path over-parameterized ConvNet to encode all architectural decisions with shared convolutional kernel parameters, drastically decreasing the number of trainable parameters and cutting the search cost down to a few epochs.
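The sketch below illustrates the shared-superkernel idea in a hedged form: a single 5x5 kernel whose inner 3x3 core is always used while the outer ring is softly gated, so the 3x3-vs-5x5 decision is made over shared weights. The sigmoid gate is a simplification for illustration, not the paper's exact indicator-based formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Superkernel sketch: one set of 5x5 weights encodes both kernel-size choices.

class SuperKernelConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 5, 5) * 0.01)
        self.threshold = nn.Parameter(torch.tensor(0.0))  # learnable decision threshold
        mask = torch.zeros(1, 1, 5, 5)
        mask[:, :, 1:4, 1:4] = 1.0                         # inner 3x3 core
        self.register_buffer("inner_mask", mask)

    def forward(self, x):
        inner = self.weight * self.inner_mask              # always-used 3x3 part
        outer = self.weight * (1.0 - self.inner_mask)      # candidate 5x5 ring
        gate = torch.sigmoid(outer.abs().sum() - self.threshold)
        return F.conv2d(x, inner + gate * outer, padding=2)
```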
BlockQNN: Efficient Block-Wise Neural Network Architecture Generation
TLDR
This paper provides a block-wise network generation pipeline called BlockQNN which automatically builds high-performance networks using the Q-Learning paradigm with an epsilon-greedy exploration strategy, and proposes a distributed asynchronous framework and an early-stopping strategy.
Practical Block-Wise Neural Network Architecture Generation
TLDR
A block-wise network generation pipeline called BlockQNN is presented, which automatically builds high-performance networks using the Q-Learning paradigm with an epsilon-greedy exploration strategy and offers a tremendous reduction of the design search space, spending only 3 days with 32 GPUs.
Learning Transferable Architectures for Scalable Image Recognition
TLDR
This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.
SMASH: One-Shot Model Architecture Search through HyperNetworks
TLDR
A technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture is proposed, achieving competitive performance with similarly-sized hand-designed networks.
Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
TLDR
A novel differentiable neural architecture search (DNAS) framework is proposed to efficiently explore the exponential search space of mixed-precision quantization with gradient-based optimization, surpassing the state-of-the-art compression of ResNet on CIFAR-10 and ImageNet.
MnasNet: Platform-Aware Neural Architecture Search for Mobile
TLDR
An automated mobile neural architecture search (MNAS) approach is proposed that explicitly incorporates model latency into the main objective, so that the search can identify a model that achieves a good trade-off between accuracy and latency.
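The platform-aware objective described above is commonly written as a soft constraint of the form ACC(m) · [LAT(m)/T]^w. The small helper below is an illustrative rendering of that reward shape; the exponent value is chosen for illustration.

```python
# Latency-aware reward sketch: accuracy scaled by how far the measured latency
# deviates from the target budget T; w < 0 penalizes models slower than T.

def latency_aware_reward(accuracy, latency_ms, target_ms, w=-0.07):
    return accuracy * (latency_ms / target_ms) ** w
```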
Efficient Neural Architecture Search via Proximal Iterations
TLDR
Different from DARTS, NASP reformulates the search process as an optimization problem with a constraint that only one operation is allowed to be updated during forward and backward propagation, and proposes a new algorithm inspired by proximal iterations to solve it.
Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
TLDR
A novel family of models called Budgeted Super Networks (BSN) is proposed, learned using gradient descent techniques applied to a budgeted learning objective function which integrates a maximum authorized cost, while making no assumption on the nature of this cost.