Corpus ID: 233004338

How Powerful are Performance Predictors in Neural Architecture Search?

@article{White2021HowPA,
  title={How Powerful are Performance Predictors in Neural Architecture Search?},
  author={Colin White and Arber Zela and Binxin Ru and Yang Liu and Frank Hutter},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.01177}
}
Early methods in the rapidly developing field of neural architecture search (NAS) required fully training thousands of neural networks. To reduce this extreme computational cost, dozens of techniques have since been proposed to predict the final performance of neural architectures. Despite the success of such performance prediction methods, it is not well-understood how different families of techniques compare to one another, due to the lack of an agreed-upon evaluation metric and optimization… 
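As a concrete illustration of the model-based predictor family the paper surveys, the sketch below fits a ridge-regression predictor on one-hot architecture encodings and uses it to rank unseen candidates without training any of them. The toy search space, the encoding, and the synthetic accuracies are all illustrative assumptions, not the paper's actual setup (real predictors use richer path- or graph-based encodings).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy search space: 6 edges, each holding one of 3 operations.
N_EDGES, N_OPS = 6, 3

def encode(arch):
    """One-hot encode an architecture (a minimal illustrative encoding)."""
    onehot = np.zeros((N_EDGES, N_OPS))
    onehot[np.arange(N_EDGES), arch] = 1.0
    return onehot.ravel()

# Synthetic "training data": architectures with noisy final accuracies.
archs = rng.integers(0, N_OPS, size=(50, N_EDGES))
true_w = rng.normal(size=N_EDGES * N_OPS)
accs = np.array([encode(a) @ true_w for a in archs]) \
       + rng.normal(scale=0.1, size=50)

# Fit a ridge-regression predictor: w = (X^T X + lam*I)^(-1) X^T y.
X = np.array([encode(a) for a in archs])
lam = 1e-2
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ accs)

def predict(arch):
    """Predicted final accuracy for an architecture, without training it."""
    return encode(arch) @ w

# Rank unseen candidates by predicted accuracy.
candidates = rng.integers(0, N_OPS, size=(20, N_EDGES))
best = max(candidates, key=predict)
```

This is the cheapest end of the design space the paper compares; the same interface (fit on evaluated architectures, score unseen ones) is shared by the more elaborate predictor families it benchmarks.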
EmProx: Neural Network Performance Estimation For Neural Architecture Search
TLDR
Performance estimations of this method are comparable in accuracy to the MLP performance predictor used in NAO, while being nearly nine times faster to train.
BINAS: Bilinear Interpretable Neural Architecture Search
TLDR
A Bilinear Interpretable approach for constrained Neural Architecture Search (BINAS) is introduced, based on an accurate and simple bilinear formulation of both an accuracy estimator and the expected resource requirement, jointly with a scalable search method with theoretical guarantees.
Surrogate NAS Benchmarks: Going Beyond the Limited Search Spaces of Tabular NAS Benchmarks
TLDR
It is shown that surrogate NAS benchmarks can model the true performance of architectures better than tabular benchmarks (at a small fraction of the cost), that they lead to faithful estimates of how well different NAS methods work on the original non-surrogate benchmark, and that they can generate new scientific insight.
What to expect of hardware metric predictors in NAS
TLDR
It is shown that simply verifying the predictions of just the selected architectures can lead to substantially improved results, and under a time budget, it is preferable to use a fast and inaccurate prediction model over accurate but slow live measurements.
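The two-stage strategy this TLDR describes can be sketched as follows: rank all candidates with a fast but noisy predictor, then spend the expensive measurement budget only on the predicted top-k. The toy latencies and the noise model below are assumptions for illustration, not the paper's measured data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: 200 candidate architectures, each with a true latency.
true_latency = rng.uniform(1.0, 10.0, size=200)

def fast_predict(i):
    """Cheap but inaccurate latency prediction (stand-in for a learned model)."""
    return true_latency[i] + rng.normal(scale=1.0)

def slow_measure(i):
    """Accurate but expensive live measurement (assumed exact here)."""
    return true_latency[i]

# Step 1: rank every candidate with the fast predictor.
predicted = np.array([fast_predict(i) for i in range(200)])
top_k = np.argsort(predicted)[:10]

# Step 2: spend the measurement budget only on the predicted top-k,
# and select by verified latency rather than by prediction.
verified = min(top_k, key=slow_measure)
```

Verifying only the short-listed candidates is what the paper finds to substantially improve results under a time budget, compared with measuring everything or trusting the predictor outright.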
NAS Evaluation is (Now) Surprisingly Easy
TLDR
An in-depth analysis of popular NAS algorithms and performance prediction methods across 25 different combinations of search spaces and datasets is presented, finding that many conclusions drawn from a few NAS benchmarks do not generalize to other benchmarks.
IQNAS: Interpretable Integer Quadratic Programming Neural Architecture Search
TLDR
This work introduces Interpretable Integer Quadratic programming Neural Architecture Search (IQNAS), which is based on an accurate and simple quadratic formulation of both the accuracy predictor and the expected resource requirement, together with a scalable search method with theoretical guarantees.
Mutation is all you need
TLDR
Experimental results are presented suggesting that the performance of BANANAS on the NAS-Bench-301 benchmark is determined by its acquisition function optimizer, which minimally mutates the incumbent.
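A minimal sketch of the mutation-only acquisition optimization described above: repeatedly apply a one-edit mutation to the incumbent and keep the child when its acquisition value is at least as good. The operation set and the toy acquisition function are hypothetical stand-ins; BANANAS itself would query a neural predictive model with uncertainty here.

```python
import random

random.seed(0)

OPS = ["conv3x3", "conv1x1", "maxpool", "skip"]

def mutate(arch):
    """Minimally mutate an architecture: change the operation on one
    randomly chosen edge to a different operation (a one-edit mutation)."""
    child = list(arch)
    i = random.randrange(len(child))
    child[i] = random.choice([op for op in OPS if op != child[i]])
    return child

def acquisition(arch):
    # Hypothetical acquisition value for illustration only; a real
    # optimizer would score architectures with a predictive model.
    return sum(op == "conv3x3" for op in arch)

# Mutation-only acquisition optimization: propose one-edit mutations of
# the incumbent and keep the best-scoring candidate seen so far.
incumbent = ["skip"] * 6
for _ in range(50):
    child = mutate(incumbent)
    if acquisition(child) >= acquisition(incumbent):
        incumbent = child
```

The cited result is that this simple local scheme, not the Bayesian machinery around it, accounts for much of BANANAS's performance on NAS-Bench-301.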
Automated machine learning for borehole resistivity measurements
TLDR
This work proposes a scoring function that accounts for the accuracy and size of the DNNs compared to a reference DNN that provides a good approximation for the operators and uses DNN architecture search algorithms to obtain a quasi-optimal DNN smaller than the reference network.
Efficient guided evolution for neural architecture search
TLDR
G-EA guides the evolution, exploring the search space by generating and evaluating several architectures in each generation at the initialisation stage using a zero-cost proxy estimator; only the highest-scoring architecture is trained and kept for the next generation.
...

References

Showing 1-10 of 85 references
NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
TLDR
This work steps outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP); the benchmark is expected to provide more reliable empirical findings in the community and stimulate progress in developing new NAS methods well suited for recurrent architectures.
How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS
TLDR
This analysis uncovers that some commonly-used heuristics for super-net training negatively impact the correlation between super-nets and stand-alone performance, and evidences the strong influence of certain hyperparameters and architectural choices.
DARTS: Differentiable Architecture Search
TLDR
The proposed algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques.
BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search
TLDR
This work develops a BO procedure that leverages a novel architecture representation and a neural network-based predictive uncertainty model on this representation and achieves state-of-the-art performance on the NASBench dataset and is over 100x more efficient than random search.
Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS
TLDR
This work proposes BONAS (Bayesian Optimized Neural Architecture Search), a sample-based NAS framework accelerated using weight-sharing to evaluate multiple related architectures simultaneously, which not only speeds up the traditional sample-based approach significantly but also keeps its reliability.
Pruning neural networks without any data by iteratively conserving synaptic flow
TLDR
The data-agnostic pruning algorithm challenges the existing paradigm that, at initialization, data must be used to quantify which synapses are important, and consistently competes with or outperforms existing state-of-the-art pruning algorithms at initialization over a range of models, datasets, and sparsity constraints.
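The synaptic-flow saliency behind this data-agnostic pruning can be worked out by hand for a two-layer linear network: run an all-ones input through the network on absolute weights, take the loss as the sum of the outputs, and score each weight by |w * dL/dw|. The gradients below are derived manually for this linear special case; the actual algorithm uses autodiff on arbitrary networks and iterates the prune-and-rescore step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-layer linear network on absolute (non-negative) weights, since
# SynFlow evaluates the network on parameter magnitudes.
W1 = np.abs(rng.normal(size=(8, 4)))   # layer 1: 4 -> 8
W2 = np.abs(rng.normal(size=(3, 8)))   # layer 2: 8 -> 3

# Data-agnostic forward pass: all-ones input, loss L = 1^T W2 W1 x.
x = np.ones(4)

# Gradients of L w.r.t. each weight matrix, derived by hand:
#   dL/dW2 = 1 (W1 x)^T        dL/dW1 = (W2^T 1) x^T
gW2 = np.outer(np.ones(3), W1 @ x)
gW1 = np.outer(W2.T @ np.ones(3), x)

# Synaptic-flow saliency: |dL/dw * w| per weight.
score1 = np.abs(gW1 * W1)
score2 = np.abs(gW2 * W2)

# One pruning step: mask out the globally lowest-scoring 20% of weights.
all_scores = np.concatenate([score1.ravel(), score2.ravel()])
threshold = np.quantile(all_scores, 0.2)
mask1, mask2 = score1 > threshold, score2 > threshold
```

A neat property visible in this linear case: each layer's scores sum to the same total flow L, which is the layerwise conservation that motivates the iterative, global scoring in the paper.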
Semi-Supervised Neural Architecture Search
TLDR
This paper proposes SemiNAS, a semi-supervised NAS approach that leverages numerous unlabeled architectures (without evaluation and thus nearly no cost) to improve the controller and achieves higher accuracy under the same computational cost.
Accelerating Neural Architecture Search using Performance Prediction
TLDR
Standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validation performance data and an early stopping method is proposed, which obtains a speedup of a factor up to 6x in both hyperparameter optimization and meta-modeling.
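The frequentist-regression idea in this TLDR can be sketched with synthetic learning curves: fit a least-squares model mapping a few early validation accuracies to the final accuracy, then use its predictions to decide which partially trained runs to abandon. The exponential curve model and every constant below are illustrative assumptions, not the paper's data or feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic learning curves: acc(t) = final * (1 - exp(-rate * t)) + noise.
n_curves, T, t_partial = 40, 100, 20
finals = rng.uniform(0.7, 0.95, size=n_curves)
rates = rng.uniform(0.1, 0.3, size=n_curves)
t = np.arange(1, T + 1)
curves = finals[:, None] * (1 - np.exp(-rates[:, None] * t)) \
         + rng.normal(scale=0.005, size=(n_curves, T))

# Features: a handful of early validation accuracies per run, plus a bias
# term (a simple stand-in for the paper's time-series feature set).
X = np.hstack([curves[:, :t_partial:4], np.ones((n_curves, 1))])
y = curves[:, -1]

# Least-squares fit on 30 runs; predict final accuracy for the rest.
w, *_ = np.linalg.lstsq(X[:30], y[:30], rcond=None)
pred = X[30:] @ w

# Early-stopping rule (as in the paper's setup): abandon a partially
# trained model whose predicted final accuracy falls below the best
# final accuracy observed so far.
```

Because only the first few epochs of each curve feed the predictor, stopping the runs it rules out is where the reported speedup in hyperparameter optimization and meta-modeling comes from.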
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
NAS-Bench-101: Towards Reproducible Neural Architecture Search
  • arXiv preprint arXiv:1902.09635, 2019
...