Corpus ID: 80628408

Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly

@article{Kandasamy2019TuningHW,
  title={Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly},
  author={Kirthevasan Kandasamy and Karun Raju Vysyaraju and Willie Neiswanger and Biswajit Paria and Christopher R. Collins and Jeff G. Schneider and Barnab{\'a}s P{\'o}czos and Eric P. Xing},
  journal={ArXiv},
  year={2019},
  volume={abs/1903.06694}
}
Bayesian Optimisation (BO) refers to a suite of techniques for global optimisation of expensive black box functions, which use introspective Bayesian models of the function to efficiently search for the optimum. While BO has been applied successfully in many applications, modern optimisation tasks usher in new challenges where conventional methods fail spectacularly. In this work, we present Dragonfly, an open source Python library for scalable and robust BO. Dragonfly incorporates multiple… 
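
For context, Dragonfly's quick-start interface is a single call that takes the objective, the domain bounds, and an evaluation budget. The snippet below is a minimal sketch in that spirit; the exact signature and return values may differ across versions of the library.

from dragonfly import minimise_function

# Arguments: the function to minimise, the domain as a list of [lower, upper]
# bounds (one per variable), and the evaluation budget (capital).
min_val, min_pt, history = minimise_function(lambda x: x ** 4 - x ** 2 + 0.1 * x,
                                             [[-10, 10]], 10)
print(min_val, min_pt)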

Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

This work introduces a method to automatically design the BO search space by relying on evaluations of previous black-box functions, departing from the common practice of defining a set of arbitrary search ranges a priori by instead using search-space geometries learnt from historical data.

Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection

This work develops a new computationally efficient high-dimensional BO method that exploits variable selection and automatically learns axis-aligned sub-spaces, i.e. spaces containing the selected variables, without requiring any pre-specified hyperparameters.

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.

HEBO Pushing The Limits of Sample-Efficient Hyperparameter Optimisation

It is observed that HEBO significantly outperforms existing black-box optimisers on 108 machine learning hyperparameter tuning tasks comprising the Bayesmark benchmark.

Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation

This work observes that a wide range of kernels gives rise to structured matrices, enabling an exact O(n²d) matrix-vector multiply for gradient observations and O(n²d²) for Hessian observations, and derives a programmatic approach to leveraging this type of structure for transformations and combinations of the discussed kernel classes.

Amortized Bayesian Optimization over Discrete Spaces

On several challenging discrete design problems, this method generally outperforms other methods at optimizing the inner acquisition function, resulting in more efficient optimization of the outer black-box objective.

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

SMAC3 offers a robust and flexible framework for Bayesian Optimization, which can improve performance within a few evaluations, and offers several facades and pre-sets for typical use cases, such as optimizing hyperparameters.

DEHB: Evolutionary Hyperband for Scalable, Robust and Efficient Hyperparameter Optimization

Comprehensive results on a very broad range of HPO problems, as well as a wide range of tabular benchmarks from neural architecture search, demonstrate that DEHB achieves strong performance far more robustly than all previous HPO methods, especially for high-dimensional problems with discrete input dimensions.

Towards Automated Design of Bayesian Optimization via Exploratory Landscape Analysis

This work shows that even a naïve random-forest regression model, built on top of exploratory landscape analysis features computed from the initial design points, is able to recommend acquisition functions (AFs) that outperform any static choice, when considering performance over the classic BBOB benchmark suite for derivative-free numerical optimization methods on the COCO platform.

HEBO: An Empirical Study of Assumptions in Bayesian Optimisation

The findings indicate that the majority of hyper-parameter tuning tasks exhibit heteroscedasticity and non-stationarity, multiobjective acquisition ensembles with Pareto front solutions improve queried configurations, and robust acquisition maximisers afford empirical advantages relative to their non-robust counterparts.
...

References

SHOWING 1-10 OF 86 REFERENCES

Neural Architecture Search with Bayesian Optimisation and Optimal Transport

This work develops NASBOT, a Gaussian process based BO framework for neural architecture search, which outperforms other alternatives for architecture search in several cross-validation based model selection tasks on multi-layer perceptrons and convolutional neural networks.

Scalable Bayesian Optimization Using Deep Neural Networks

This work shows that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of observations rather than cubically, which allows for a previously intractable degree of parallelism.
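
The linear scaling comes from replacing exact GP inference with Bayesian linear regression on basis features taken from the network's last hidden layer. A minimal sketch of that linear head is below; the feature map is omitted, and alpha/beta are illustrative prior and noise precisions, not values from the paper.

import numpy as np

def bayesian_linear_head(Phi, y, alpha=1.0, beta=100.0):
    # Phi: (n, k) basis features, e.g. the last hidden layer of a trained
    # network evaluated at the n observed inputs; y: (n,) observed targets.
    k = Phi.shape[1]
    A = alpha * np.eye(k) + beta * Phi.T @ Phi      # posterior precision
    A_inv = np.linalg.inv(A)
    m = beta * A_inv @ Phi.T @ y                    # posterior mean weights
    def predict(Phi_star):
        # Predictive mean and variance at new feature rows Phi_star: (m, k).
        mean = Phi_star @ m
        var = 1.0 / beta + np.einsum('ij,jk,ik->i', Phi_star, A_inv, Phi_star)
        return mean, var
    return predict

# Cost is O(n k^2 + k^3): linear in the number of observations n for a fixed
# feature width k, versus the O(n^3) of exact GP inference.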

Multi-fidelity Bayesian Optimisation with Continuous Approximations

This work develops a Bayesian optimisation method, BOCA, that achieves better regret than strategies which ignore the approximations, and outperforms several other baselines in synthetic and real experiments.

Bayesian Optimization with Robust Bayesian Neural Networks

This work presents a general approach for using flexible parametric models (neural networks) for Bayesian optimization, staying as close to a truly Bayesian treatment as possible and obtaining scalability through stochastic gradient Hamiltonian Monte Carlo, whose robustness is improved via a scale adaptation.

Batch Bayesian Optimization via Local Penalization

A simple heuristic based on an estimate of the Lipschitz constant is investigated; it captures the most important aspect of the interaction between batch points at negligible computational overhead and compares well, in running time, with much more elaborate alternatives.
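
Concretely, each pending batch point multiplies the acquisition by a soft exclusion factor whose radius is governed by the Lipschitz estimate. The sketch below is an illustrative rendering of such a penalizer; mu and sigma denote the surrogate's posterior mean and standard deviation, L the Lipschitz estimate, and M the best observed value, and all names are ours rather than the paper's code.

import numpy as np
from scipy.special import erfc

def penalized_acquisition(x, acq, pending, mu, sigma, L, M):
    # Each pending point x_j contributes phi(x; x_j) = 0.5 * erfc(-z_j), with
    # z_j = (L * ||x - x_j|| - M + mu(x_j)) / (sqrt(2) * sigma(x_j)),
    # which is near 0 close to x_j and approaches 1 far away from it.
    phi = 1.0
    for x_j in pending:
        z = (L * np.linalg.norm(x - x_j) - M + mu(x_j)) / (np.sqrt(2.0) * sigma(x_j))
        phi *= 0.5 * erfc(-z)
    return acq(x) * phi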

Parallelised Bayesian Optimisation via Thompson Sampling

This work designs and analyses variations of the classical Thompson sampling procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive but can be performed in parallel, and shows that asynchronous TS outperforms a suite of existing parallel BO algorithms in simulations and in an application involving tuning hyper-parameters of a convolutional neural network.
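
The proposal rule itself is simple: whenever a worker frees up, draw one sample from the surrogate posterior and send that worker to the sample's maximiser. A pool-based sketch, assuming a scikit-learn-style GP exposing predict(..., return_cov=True), is given below; the finite candidate pool is a simplification of continuous-domain maximisation.

import numpy as np

def thompson_proposal(gp, candidate_pool, rng):
    # Draw one joint posterior sample over a finite candidate pool and return
    # the candidate that maximises that sample.
    mean, cov = gp.predict(candidate_pool, return_cov=True)
    sample = rng.multivariate_normal(mean, cov)
    return candidate_pool[np.argmax(sample)]

# Asynchronous use: each time any worker finishes, refit gp on all completed
# (x, y) pairs and immediately dispatch thompson_proposal(...) to that worker,
# rather than waiting for the rest of the batch.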

High Dimensional Bayesian Optimisation and Bandits via Additive Models

It is demonstrated that the method outperforms naive BO on additive functions and on several examples where the function is not additive, and it is proved that, for additive functions, the regret has only linear dependence on D even though the function depends on all D dimensions.
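
The additive assumption means f decomposes into a sum of low-dimensional components over disjoint groups of coordinates, so the GP kernel is itself a sum of kernels on those groups. A minimal sketch of such a kernel follows; the grouping and lengthscales are taken as given here, whereas the paper also learns the decomposition.

import numpy as np

def additive_rbf_kernel(X1, X2, groups, lengthscales):
    # k(x, x') = sum_j k_j(x[A_j], x'[A_j]) over disjoint coordinate groups A_j,
    # each k_j an RBF acting only on its own low-dimensional block.
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for idx, ls in zip(groups, lengthscales):
        A, B = X1[:, idx], X2[:, idx]
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        K += np.exp(-0.5 * sq_dists / ls ** 2)
    return K

# Example: a 6-D input split into three 2-D groups.
X = np.random.rand(5, 6)
K = additive_rbf_kernel(X, X, groups=[[0, 1], [2, 3], [4, 5]],
                        lengthscales=[1.0, 1.0, 1.0])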

Bayesian Optimization with Tree-structured Dependencies

A novel surrogate model for Bayesian optimization is introduced which combines independent Gaussian Processes with a linear model that encodes a tree-based dependency structure and can transfer information between overlapping decision sequences.

Multi-Task Bayesian Optimization

This paper proposes an adaptation of a recently developed acquisition function, entropy search, to the cost-sensitive, multi-task setting and demonstrates the utility of this new acquisition function by leveraging a small dataset to explore hyper-parameter settings for a large dataset.

Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach

This work combines structured state-space exploration through hierarchical partitioning with querying these partitions at multiple fidelities, and develops a multi-fidelity, bandit-based tree-search algorithm for noisy black-box optimization.
...