β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

@article{Ye2022DARTSBR,
  title={$\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search},
  author={Peng Ye and Baopu Li and Yikang Li and Tao Chen and Jiayuan Fan and Wanli Ouyang},
  journal={ArXiv},
  year={2022},
  volume={abs/2203.01665}
}
Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically. Among existing approaches, differentiable NAS methods such as DARTS have gained popularity for their search efficiency. However, they suffer from two main issues: weak robustness to performance collapse and poor generalization ability of the searched architectures. To solve these two problems, a simple-but-efficient regularization method…
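
The abstract excerpt stops before describing the regularizer itself, so the following is only a rough sketch of the general idea suggested by the title: a decay-style penalty on the softmax-normalized architecture weights (beta) rather than on the raw logits, added to the DARTS architecture-update loss. The supernet interface, the penalty form, and the coefficient lam are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def beta_style_regularizer(arch_parameters):
    """Decay-style penalty on the softmax-normalized architecture weights.

    Illustrative only: this is NOT the paper's exact Beta-Decay loss; it merely
    shows a penalty acting on beta = softmax(alpha) instead of on alpha itself.
    """
    reg = 0.0
    for alpha in arch_parameters:           # one (edges x ops) tensor per cell type
        beta = F.softmax(alpha, dim=-1)     # normalized operation weights per edge
        reg = reg + (beta ** 2).sum()       # simple L2-style decay on beta (assumption)
    return reg

def architecture_step(arch_optimizer, val_loss, arch_parameters, lam=1.0):
    """One architecture update: validation loss plus the weighted regularizer."""
    loss = val_loss + lam * beta_style_regularizer(arch_parameters)
    arch_optimizer.zero_grad()
    loss.backward()
    arch_optimizer.step()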

Citations

Generalization Properties of NAS under Activation and Skip Connection Search

TLDR
This work derives lower (and upper) bounds on the minimum eigenvalue of the Neural Tangent Kernel in the (in)finite-width regime for a search space comprising mixed activation functions, fully connected networks, and residual neural networks, and leverages these eigenvalue bounds to establish generalization error bounds for NAS under stochastic gradient descent training.
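
The bounds above are analytic, but the quantity they constrain can be probed numerically. The sketch below computes the minimum eigenvalue of the empirical (finite-width) NTK Gram matrix for a scalar-output network on a small batch; the toy network and batch sizes are placeholders, not the paper's setting.

import torch
import torch.nn as nn

def empirical_ntk_min_eig(model, x):
    """Minimum eigenvalue of the empirical NTK Gram matrix on a batch x.

    K[i, j] = <d f(x_i)/d theta, d f(x_j)/d theta> for a scalar output f;
    a finite-width stand-in for the kernel quantities studied in the paper.
    Assumes every trainable parameter takes part in the forward pass.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    rows = []
    for xi in x:
        out = model(xi.unsqueeze(0)).sum()              # scalar output per sample
        grads = torch.autograd.grad(out, params)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    J = torch.stack(rows)                               # (batch, num_params) Jacobian
    K = J @ J.T                                         # empirical NTK Gram matrix
    return torch.linalg.eigvalsh(K).min()

# Toy usage on a small fully connected network (illustrative sizes).
net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
print(empirical_ntk_min_eig(net, torch.randn(8, 16)))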

MRF-UNets: Searching UNet with Markov Random Fields

TLDR
This work proposes Markov Random Field Neural Architecture Search (MRF-NAS), which extends and improves the recent Adaptive and Optimal Network Width Search (AOWS) method with a more general MRF framework and identifies the sub-optimality of the original UNet architecture.

References

Showing 1-10 of 33 references

NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search

TLDR
This work introduces a general framework for one-shot NAS that can be instantiated to many recently introduced variants, together with a benchmarking framework that draws on the large-scale tabular benchmark NAS-Bench-101 for cheap anytime evaluations of one-shot NAS methods.

Adapting Neural Architectures Between Domains

TLDR
The theoretical analyses lead to AdaptNAS, a novel and principled approach for adapting neural architectures between domains in NAS; experiments show that only a small part of ImageNet is sufficient for AdaptNAS to extend the success of its architectures to the entire ImageNet and to outperform state-of-the-art comparison algorithms.

DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

TLDR
It is conjectured that skip connections profit too much from this privilege, causing performance collapse in the derived model, and an auxiliary skip connection is proposed to factor out this benefit, ensuring a fairer competition among all operations.
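
A minimal sketch of the auxiliary-skip idea described above, assuming a standard DARTS-style mixed edge; the decay schedule and module interface are illustrative, not the paper's exact implementation. Each edge gets an extra skip branch whose coefficient is annealed to zero during search, so candidate skip connections no longer rely on that shortcut to win the competition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedEdgeWithAuxSkip(nn.Module):
    """A DARTS mixed edge plus an auxiliary skip branch (sketch of the idea).

    Candidate operations are assumed to preserve the input shape.
    """

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)                             # candidate operations
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(ops)))   # architecture logits
        self.aux_beta = 1.0                                       # auxiliary skip coefficient

    def set_aux_beta(self, epoch, total_epochs):
        # Linearly decay the auxiliary skip's weight to zero over the search (assumed schedule).
        self.aux_beta = 1.0 - epoch / total_epochs

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=-1)
        mixed = sum(w * op(x) for w, op in zip(weights, self.ops))
        return mixed + self.aux_beta * x                          # factored-out skip benefit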

DARTS: Differentiable Architecture Search

TLDR
The proposed algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques.
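
For context on what makes the search differentiable, the sketch below shows the alternating first-order update DARTS performs: architecture parameters are updated on a validation batch and network weights on a training batch. The supernet and the two optimizers are assumed to be set up over the corresponding parameter groups, and the paper's second-order correction is omitted.

import torch

def darts_search_step(supernet, w_optimizer, alpha_optimizer,
                      train_batch, val_batch, criterion):
    """One alternating (first-order) bilevel update in the style of DARTS.

    alpha_optimizer is assumed to hold only the architecture parameters and
    w_optimizer only the network weights.
    """
    # 1) Update architecture parameters on the validation batch.
    x_val, y_val = val_batch
    alpha_optimizer.zero_grad()
    criterion(supernet(x_val), y_val).backward()
    alpha_optimizer.step()

    # 2) Update network weights on the training batch.
    x_train, y_train = train_batch
    w_optimizer.zero_grad()
    criterion(supernet(x_train), y_train).backward()
    w_optimizer.step()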

Stabilizing Differentiable Architecture Search via Perturbation-based Regularization

TLDR
This work finds that the precipitous validation loss landscape, which leads to a dramatic performance drop when discretizing the final architecture, is an essential factor causing instability, and proposes a perturbation-based regularization, SmoothDARTS (SDARTS), to smooth the loss landscape and improve the generalizability of DARTS-based methods.
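
A minimal sketch of the random-smoothing flavor of this regularization, assuming the supernet exposes its architecture parameters via an arch_parameters() accessor; the perturbation radius is a placeholder. Weights are trained under architecture parameters perturbed by uniform noise, so they become robust to a neighborhood of alpha rather than to a single point.

import torch

def perturbed_weight_step(supernet, w_optimizer, train_batch, criterion, radius=0.03):
    """Train supernet weights under randomly perturbed architecture parameters."""
    x, y = train_batch
    alphas = list(supernet.arch_parameters())          # assumed accessor
    noise = [torch.empty_like(a).uniform_(-radius, radius) for a in alphas]

    with torch.no_grad():                              # apply the perturbation
        for a, n in zip(alphas, noise):
            a.add_(n)

    w_optimizer.zero_grad()                            # w_optimizer holds weights only
    criterion(supernet(x), y).backward()
    w_optimizer.step()

    with torch.no_grad():                              # restore the original alphas
        for a, n in zip(alphas, noise):
            a.sub_(n)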

NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search

TLDR
This work proposes NAS-Bench-201, an extension of NAS-Bench-101 with a different search space, results on multiple datasets, and additional diagnostic information such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms.

Rethinking Architecture Selection in Differentiable NAS

TLDR
This work proposes an alternative, perturbation-based architecture selection that directly measures each operation's influence on the supernet, finds that it consistently extracts significantly improved architectures from the underlying supernets, and re-evaluates several differentiable NAS methods with the proposed architecture selection.
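
A sketch of the selection rule described above; evaluate_fn and mask_op are hypothetical interfaces introduced only for illustration. For a given edge, each candidate operation is removed in turn, and the one whose removal hurts supernet validation accuracy the most is kept.

def select_operation_for_edge(supernet, edge, candidate_ops, evaluate_fn):
    """Pick the operation whose removal degrades supernet accuracy the most.

    evaluate_fn(supernet) -> validation accuracy; supernet.mask_op(edge, op)
    temporarily disables one candidate and returns a callable that undoes it.
    Both interfaces are illustrative placeholders.
    """
    base_acc = evaluate_fn(supernet)
    influence = {}
    for op in candidate_ops:
        undo = supernet.mask_op(edge, op)                    # disable this op on this edge
        influence[op] = base_acc - evaluate_fn(supernet)     # accuracy drop = influence
        undo()                                               # restore the supernet
    return max(influence, key=influence.get)                 # keep the most influential op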

iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients

TLDR
This paper formulates a stochastic hypergradient approximation for differentiable NAS and theoretically shows that architecture optimization with the proposed method, named iDARTS, is expected to converge to a stationary point.

MixSearch: Searching for Domain Generalized Medical Image Segmentation Architectures

TLDR
Extensive experiments show that the architectures automatically learned by the proposed MixSearch surpass U-Net and its variants by a significant margin, verifying the generalization ability and practicability of the proposed method.

Sharpness-Aware Minimization for Efficiently Improving Generalization

TLDR
This work introduces a novel, effective procedure for simultaneously minimizing loss value and loss sharpness, Sharpness-Aware Minimization (SAM), which improves model generalization across a variety of benchmark datasets and models, yielding new state-of-the-art performance for several of them.
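
The SAM update can be sketched compactly as a two-step procedure, shown below with the usual first-order approximation; the model, loss, and neighborhood radius rho are placeholders. The weights are first perturbed toward the worst case within an L2 ball of radius rho, and the gradient computed there is then applied to the original weights.

import torch

def sam_step(model, loss_fn, batch, optimizer, rho=0.05):
    """One Sharpness-Aware Minimization step (first-order sketch).

    Assumes the optimizer covers exactly the model's trainable parameters
    and that every trainable parameter receives a gradient.
    """
    x, y = batch
    params = [p for p in model.parameters() if p.requires_grad]

    # 1) Gradient at the current weights gives the ascent direction.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params]))
    eps = [rho * p.grad / (grad_norm + 1e-12) for p in params]

    # 2) Move to the approximate worst case and recompute the gradient there.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # 3) Undo the perturbation and update with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    optimizer.step()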