The case for fully Bayesian optimisation in small-sample trials

@article{Saikai2022TheCF,
  title={The case for fully Bayesian optimisation in small-sample trials},
  author={Yuji Saikai},
  journal={ArXiv},
  year={2022},
  volume={abs/2208.13960}
}
  • Yuji Saikai
  • Published 30 August 2022
  • Computer Science
  • ArXiv
While sample efficiency is the main motive for the use of Bayesian optimisation when black-box functions are expensive to evaluate, the standard approach based on type II maximum likelihood (ML-II) may fail and result in disappointing performance in small-sample trials. The paper provides three compelling reasons to adopt fully Bayesian optimisation (FBO) as an alternative. First, failures of ML-II are more commonplace than the existing studies, which use contrived settings, imply. Second, FBO is… 
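
For concreteness, here is a minimal, self-contained sketch (not taken from the paper) of the two hyperparameter treatments the abstract contrasts: ML-II collapses the Gaussian process hyperparameters to a single point estimate by maximising the marginal likelihood, while a fully Bayesian treatment keeps a weighted set of hyperparameter samples. All names and the toy data below are illustrative assumptions, not the paper's code.

import numpy as np
from scipy.optimize import minimize

def log_marginal_likelihood(theta, X, y):
    """GP log marginal likelihood with an RBF kernel.
    theta = (log lengthscale, log signal sd, log noise sd)."""
    ell, sf, sn = np.exp(theta)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = sf**2 * np.exp(-0.5 * d2 / ell**2) + sn**2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (5, 1))                       # small-sample regime: n = 5
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(5)

# ML-II: one point estimate. With five points the likelihood surface is
# flat or multimodal, so this optimum can be fragile.
ml2 = minimize(lambda t: -log_marginal_likelihood(t, X, y), np.zeros(3))

# A fully Bayesian treatment keeps a posterior over theta instead, e.g.
# via MCMC; here, crudely, via importance-weighted draws from a N(0,1)
# prior on the log hyperparameters.
thetas = rng.normal(0.0, 1.0, (200, 3))
logw = np.array([log_marginal_likelihood(t, X, y) for t in thetas])
w = np.exp(logw - logw.max())
w /= w.sum()                                        # self-normalised weights
print("ML-II theta:", ml2.x, "posterior-mean theta:", w @ thetas)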

References

Showing 1-10 of 32 references

How Bayesian should Bayesian optimisation be?

This work investigates whether a fully Bayesian treatment of the Gaussian process hyperparameters in BO (FBBO) leads to improved optimisation performance, and recommends FBBO using EI with an ARD kernel as the default choice for BO.
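
As background, an ARD ("automatic relevance determination") kernel gives each input dimension its own lengthscale, letting the model learn which inputs matter. A minimal sketch, with function and argument names of my own choosing:

import numpy as np

def ard_rbf(X1, X2, lengthscales, signal_sd):
    """RBF kernel with one lengthscale per input dimension (ARD)."""
    Z1 = X1 / lengthscales                  # scale each dimension separately
    Z2 = X2 / lengthscales
    d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return signal_sd**2 * np.exp(-0.5 * d2)

# Example: dimension 0 varies fastest (short lengthscale), dimension 2 barely matters.
X = np.random.default_rng(0).uniform(size=(4, 3))
print(ard_rbf(X, X, lengthscales=np.array([0.1, 1.0, 10.0]), signal_sd=1.0))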

Robust Gaussian Process-Based Global Optimization Using a Fully Bayesian Expected Improvement Criterion

Numerical experiments show that the fully Bayesian approach makes EI-based optimization more robust while maintaining an average loss similar to that of the EGO algorithm.
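
The fully Bayesian EI criterion replaces the usual plug-in EI with its average over draws from the hyperparameter posterior. A sketch using the standard closed form for minimisation; the array shapes are assumptions of mine:

import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI (minimisation) at points with GP posterior mean mu, sd sigma."""
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def fully_bayesian_ei(mus, sigmas, f_best):
    """Average EI over M hyperparameter draws.
    mus, sigmas: shape (M, n_candidates), one row per posterior draw."""
    return np.mean([expected_improvement(m, s, f_best)
                    for m, s in zip(mus, sigmas)], axis=0)

# Two hyperparameter draws, two candidate points.
mus = np.array([[0.2, 0.5], [0.1, 0.4]])
sigmas = np.full((2, 2), 0.3)
print(fully_bayesian_ei(mus, sigmas, f_best=0.3))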

Theoretical Analysis of Bayesian Optimisation with Unknown Gaussian Process Hyper-Parameters

A cumulative regret bound for Bayesian optimisation with Gaussian processes and unknown kernel hyper-parameters in the stochastic setting is derived; it applies to the expected improvement acquisition function and sub-Gaussian observation noise, and provides guidelines on how to design hyper-parameter estimation methods.

Comparison of Approximate Methods for Handling Hyperparameters

  • D. MacKay
  • Computer Science
    Neural Computation
  • 1999
Two approximate methods for the computational implementation of Bayesian hierarchical models with unknown hyperparameters, such as regularization constants and noise levels, are examined; the evidence framework is shown to introduce negligible predictive error under straightforward conditions.

Practical Bayesian Optimization of Machine Learning Algorithms

This work describes new algorithms that account for the variable cost of learning-algorithm experiments and that can leverage multiple cores for parallel experimentation, and shows that the proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
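
One cost-aware idea associated with this line of work is often summarised as "expected improvement per second": divide EI by a model of how long the evaluation will take, so that cheap, promising configurations win. A hedged sketch; the duration model is a stand-in, not the paper's implementation:

from scipy.stats import norm

def ei_per_second(mu, sigma, f_best, predicted_seconds):
    """Cost-aware acquisition: expected improvement per unit of predicted runtime."""
    z = (f_best - mu) / sigma
    ei = (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    return ei / predicted_seconds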

Convergence Rates of Efficient Global Optimization Algorithms

  • Adam D. Bull
  • Computer Science, Mathematics
    J. Mach. Learn. Res.
  • 2011
This work provides convergence rates for expected improvement, proposes alternative estimators chosen to minimize the constants in the rate of convergence, and shows that these estimators retain the convergence rates of a fixed prior.

Towards Gaussian Process-based Optimization with Finite Time Horizon

This work formulates the problem of finding an optimal strategy for finite-horizon sequential optimization, provides the solution in terms of a new multipoint EI, and illustrates the suboptimality of maximizing the one-point EI at each iteration with a first counter-example.
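
A multipoint ("q-point") EI scores a whole batch of candidates jointly; since it has no convenient closed form for q > 1, it is typically estimated by Monte Carlo over the joint GP posterior at the batch. A minimal sketch, with names of my own:

import numpy as np

def multipoint_ei(mu, cov, f_best, n_samples=4096, seed=0):
    """Monte Carlo q-point EI for minimisation.
    mu: (q,) joint posterior mean at the batch, cov: (q, q) its covariance."""
    rng = np.random.default_rng(seed)
    f = rng.multivariate_normal(mu, cov, size=n_samples)    # (n_samples, q)
    improvement = np.maximum(f_best - f.min(axis=1), 0.0)   # best point in the batch
    return improvement.mean()

# A batch of two correlated candidates.
print(multipoint_ei(np.array([0.0, 0.1]), np.array([[1.0, 0.5], [0.5, 1.0]]), f_best=0.0))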

Efficient Global Optimization of Expensive Black-Box Functions

This paper introduces the reader to a response surface methodology that is especially good at modeling the nonlinear, multimodal functions that often occur in engineering and shows how these approximating functions can be used to construct an efficient global optimization algorithm with a credible stopping rule.
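
In outline, the EGO loop alternates surrogate fitting with EI maximisation and stops once the best attainable EI is negligible, which is the "credible stopping rule" mentioned above. A toy, self-contained sketch with fixed kernel hyperparameters (the real algorithm re-estimates them each round):

import numpy as np
from scipy.stats import norm

def gp_posterior(Xtr, ytr, Xte, ell=0.2, sf=1.0, sn=1e-3):
    """1-D GP posterior mean/sd with fixed RBF hyperparameters (a simplification)."""
    k = lambda A, B: sf**2 * np.exp(-0.5 * (A[:, None] - B[None, :])**2 / ell**2)
    K = k(Xtr, Xtr) + sn**2 * np.eye(len(Xtr))
    Ks = k(Xtr, Xte)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ ytr
    var = np.clip(sf**2 - np.einsum('ij,ij->j', Ks, sol), 1e-12, None)
    return mu, np.sqrt(var)

def ego(f, grid, n_init=3, budget=20, tol=1e-6, seed=0):
    """Minimal EGO: fit surrogate, evaluate the EI maximiser, stop on tiny EI."""
    rng = np.random.default_rng(seed)
    X = list(rng.choice(grid, n_init, replace=False))
    y = [f(x) for x in X]
    for _ in range(budget):
        mu, sd = gp_posterior(np.array(X), np.array(y), grid)
        z = (min(y) - mu) / sd
        ei = (min(y) - mu) * norm.cdf(z) + sd * norm.pdf(z)
        if ei.max() < tol:                      # the credible stopping rule
            break
        X.append(grid[int(np.argmax(ei))])
        y.append(f(X[-1]))
    return X[int(np.argmin(y))], min(y)

print(ego(lambda x: (x - 0.3)**2 + 0.05 * np.sin(20 * x), np.linspace(0, 1, 200)))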

Bayesian data analysis.

  • J. Kruschke
  • Psychology
    Wiley Interdisciplinary Reviews: Cognitive Science
  • 2010
A fatal flaw of null hypothesis significance testing (NHST) is reviewed and some benefits of Bayesian data analysis are introduced, with illustrative examples of multiple comparisons in Bayesian analysis of variance and Bayesian approaches to statistical power.