Corpus ID: 3638969

Efficient Neural Architecture Search via Parameter Sharing

Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean
We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss… 
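The alternating scheme described above — a controller updated by policy gradient against a validation reward, choosing among candidate operations — can be sketched in a toy form. This is a minimal illustrative sketch, not the authors' code: the operation list, the synthetic reward, and all function names are assumptions, and the shared-weight child training is abstracted into a stand-in reward.

```python
import math
import random

# Candidate operations the controller chooses among (illustrative).
OPS = ["tanh", "relu", "identity", "sigmoid"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs, rng):
    # Draw an index according to the categorical distribution `probs`.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def validation_reward(choice):
    # Stand-in for the validation accuracy of the sampled subgraph;
    # here op 1 ("relu") is pretended to be best. In ENAS this would
    # come from evaluating the child model with shared weights.
    return 1.0 if choice == 1 else 0.1

def train_controller(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(OPS)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        a = sample(probs, rng)
        r = validation_reward(a)
        baseline = 0.9 * baseline + 0.1 * r  # moving-average baseline
        adv = r - baseline
        # REINFORCE: grad of log pi(a) w.r.t. logits is onehot(a) - probs.
        for i in range(len(logits)):
            g = (1.0 if i == a else 0.0) - probs[i]
            logits[i] += lr * adv * g
    return logits

logits = train_controller()
best = max(range(len(OPS)), key=lambda i: logits[i])
```

The moving-average baseline reduces the variance of the policy-gradient estimate; after training, the controller's probability mass concentrates on the operation with the highest validation reward.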

Distribution Consistent Neural Architecture Search

This paper theoretically investigates how the weight coupling problem affects the network searching performance from a parameter distribution perspective, and proposes a novel supernet training strategy with a Distribution Consistent Constraint that can provide a good measurement for the extent to which two architectures can share weights.

Neural Architecture Generator Optimization

This work is the first to investigate casting NAS as a problem of finding the optimal network generator and proposes a new, hierarchical and graph-based search space capable of representing an extremely large variety of network types, yet only requiring few continuous hyper-parameters.

Task-Aware Performance Prediction for Efficient Architecture Search

This work proposes a novel gradient-based framework for efficient architecture search that shares information across several tasks, adopting a continuous parametrization of the model architecture that allows for efficient gradient-based optimization.

A Novel Training Protocol for Performance Predictors of Evolutionary Neural Architecture Search Algorithms

A new training protocol is proposed, consisting of a pairwise ranking indicator to construct the training target, logistic regression to fit the training samples, and a differential method to build the training instances, which together significantly improve performance prediction accuracy over traditional training protocols.
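The pairwise-ranking idea above can be sketched as logistic regression on architecture feature differences. This is an assumed toy setup, not the paper's implementation: the features, the hidden scoring rule, and the training loop are all illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_pairwise(pairs, dim, lr=0.5, epochs=200):
    # pairs: list of (x_a, x_b, label) where label = 1 means a beats b.
    # Fit logistic regression on the difference vector x_a - x_b.
    w = [0.0] * dim
    for _ in range(epochs):
        for xa, xb, y in pairs:
            diff = [a - b for a, b in zip(xa, xb)]
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            err = y - p
            for i in range(dim):
                w[i] += lr * err * diff[i]
    return w

def rank_score(w, x):
    # Higher score = predicted-better architecture.
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data: 2-feature "architectures" ranked by a hidden rule in which
# feature 0 helps and feature 1 hurts.
rng = random.Random(0)
archs = [[rng.random(), rng.random()] for _ in range(20)]
true_score = lambda x: 2 * x[0] - x[1]
pairs = []
for _ in range(200):
    a, b = rng.sample(archs, 2)
    pairs.append((a, b, 1 if true_score(a) > true_score(b) else 0))

w = fit_pairwise(pairs, dim=2)
```

Training on pairwise comparisons rather than absolute accuracies sidesteps the difficulty of regressing noisy performance values: only the relative ordering must be learned, which is what an evolutionary search needs to select candidates.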

Sample-Efficient Neural Architecture Search by Learning Actions for Monte Carlo Tree Search

Empirical results demonstrate that LaNAS is at least an order of magnitude more sample-efficient than baseline methods including evolutionary algorithms, Bayesian optimization, and random search.

FENAS: Flexible and Expressive Neural Architecture Search

This work proposes a novel architecture search algorithm called Flexible and Expressive Neural Architecture Search (FENAS), with a more flexible and expressive search space than ENAS in terms of activation functions, input edges, and atomic operations.

Recurrent Neural Architecture Search based on Randomness-Enhanced Tabu Algorithm

This paper applies the randomness-enhanced tabu algorithm as a controller to sample candidate architectures, which balances the global exploration and local exploitation for the architectural solutions, and discovers the recurrent neural architecture within 0.78 GPU hour.

Improving the Efficient Neural Architecture Search via Rewarding Modifications

Improved-ENAS is proposed, a further improvement of ENAS that augments the reinforcement learning training method by modifying the reward of each tested architecture according to the results obtained in previously tested architectures.

AdvantageNAS: Efficient Neural Architecture Search with Credit Assignment

A novel search strategy for one-shot and sparse-propagation NAS, named AdvantageNAS, is proposed; it reduces the time complexity of NAS by cutting the number of search iterations, and is shown to monotonically improve the expected loss and thus converge.

Efficient Neural Architecture Search with Network Morphism

A novel framework is proposed that enables Bayesian optimization to guide network morphism for efficient neural architecture search, introducing a neural network kernel and a tree-structured acquisition function optimization algorithm.

Efficient Architecture Search by Network Transformation

This paper proposes a new framework toward efficient architecture search by exploring the architecture space based on the current network and reusing its weights, and employs a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations.

Neural Architecture Search with Reinforcement Learning

This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.

Accelerating Neural Architecture Search using Performance Prediction

Standard frequentist regression models can predict the final performance of partially trained model configurations using features based on network architectures, hyperparameters, and time-series validation performance data; an early-stopping method built on these predictions obtains a speedup of up to 6x in both hyperparameter optimization and meta-modeling.
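The early-stopping idea above — regress final performance from partial learning-curve features, then terminate unpromising configurations — can be sketched with a minimal least-squares model. This is an assumed toy setup, not the paper's predictor: the single early-accuracy feature, the synthetic runs, and the `slack` threshold are all illustrative.

```python
import random

def fit_linear(xs, ys):
    # Ordinary least squares for y = a * x + b.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

def predict(model, x):
    a, b = model
    return a * x + b

# Toy "completed runs": final accuracy correlates with early accuracy,
# standing in for the time-series validation features in the paper.
rng = random.Random(1)
early = [rng.uniform(0.2, 0.6) for _ in range(30)]
final = [0.3 + 1.0 * e + rng.gauss(0, 0.01) for e in early]

model = fit_linear(early, final)

def should_stop(early_acc, best_final_so_far, slack=0.02):
    # Terminate a partially trained config whose predicted final
    # accuracy falls short of the current best by more than `slack`.
    return predict(model, early_acc) < best_final_so_far - slack
```

A configuration whose early accuracy already trails the best completed run gets cut off, which is where the reported speedup comes from: compute is reallocated from predictable losers to promising candidates.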

Hierarchical Representations for Efficient Architecture Search

This work efficiently discovers architectures that outperform a large number of manually designed models for image classification, obtaining top-1 error of 3.6% on CIFAR-10 and 20.3% when transferred to ImageNet, which is competitive with the best existing neural architecture search approaches.

Designing Neural Network Architectures using Reinforcement Learning

MetaQNN is introduced, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task that beat existing networks designed with the same layer types and are competitive against the state-of-the-art methods that use more complex layer types.

Neural Optimizer Search with Reinforcement Learning

An approach to automate the process of discovering optimization methods, with a focus on deep learning architectures, introducing two new optimizers, named PowerSign and AddSign, which transfer well and improve training on a variety of different tasks and architectures.

Practical Network Blocks Design with Q-Learning

This work provides a solution to automatically and efficiently design high performance network architectures by focusing on constructing network blocks, which can be stacked to generate the whole network.

Peephole: Predicting Network Performance Before Training

A unified way to encode individual layers into vectors and combine them into an integrated description via an LSTM, exploiting the recurrent network's strong expressive power, can reliably predict the performance of various network architectures.

Capacity and Trainability in Recurrent Neural Networks

It is found that for several tasks it is the per-task parameter capacity bound that determines performance, and two novel RNN architectures are proposed, one of which is easier to train than the LSTM or GRU for deeply stacked architectures.

Regularizing and Optimizing LSTM Language Models

This paper proposes the weight-dropped LSTM which uses DropConnect on hidden-to-hidden weights as a form of recurrent regularization and introduces NT-ASGD, a variant of the averaged stochastic gradient method, wherein the averaging trigger is determined using a non-monotonic condition as opposed to being tuned by the user.