Bayesian Optimization Over Iterative Learners with Structured Responses: A Budget-aware Planning Approach

Syrine Belakaria, Rishit Sheth, Janardhan Rao Doppa, Nicolò Fusi
The growing size of deep neural networks (DNNs) and of the datasets used to train them motivates the need for efficient solutions to simultaneous model selection and training. Many methods for hyperparameter optimization (HPO) of iterative learners, including DNNs, attempt to solve this problem by querying and learning a response surface while searching for its optimum. However, many of these methods make myopic queries, do not consider prior knowledge about the response structure, and/or perform…

Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

This work proposes the budgeted multi-step expected improvement, a non-myopic acquisition function that generalizes classical expected improvement to the setting of heterogeneous and unknown evaluation costs, and shows that the acquisition function outperforms existing methods in a variety of synthetic and real problems.

Bayesian Optimization for Iterative Learning

This paper proposes to learn an evaluation function compressing learning progress at any stage of the training process into a single numeric score according to both training success and stability, and presents a Bayesian optimization approach which exploits the iterative structure of learning algorithms for efficient hyperparameter tuning.

BINOCULARS for efficient, nonmyopic sequential experimental design

The key idea is simple and surprisingly effective: first compute a one-step optimal batch of experiments, then select a single point from this batch to evaluate, and it is demonstrated that BINOCULARS significantly outperforms myopic alternatives in real-world scenarios.
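The two-step recipe in this summary (build a one-step optimal batch, then evaluate a single member of it) can be sketched in a few lines of NumPy. This is a toy illustration under independent-Gaussian candidate assumptions: the paper optimizes a joint batch acquisition, whereas this stand-in greedily ranks candidates by individual expected improvement, and the "pick the most uncertain batch member" selection rule is a hypothetical choice for the sketch, not the paper's.

```python
import math
import numpy as np

def ei(mu, sigma, best):
    """Closed-form expected improvement of a single Gaussian candidate."""
    u = (mu - best) / sigma
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def binoculars_pick(mus, sigmas, best, q=3):
    # Step 1 (proxy for the one-step optimal batch): take the q candidates
    # with highest individual EI. The actual method optimizes a joint batch
    # acquisition; this greedy stand-in is only for illustration.
    scores = np.array([ei(m, s, best) for m, s in zip(mus, sigmas)])
    batch = np.argsort(scores)[-q:]
    # Step 2: evaluate a single point from the batch; here the most
    # uncertain member (a hypothetical selection rule for this sketch).
    return int(batch[np.argmax(np.asarray(sigmas)[batch])])
```

For example, `binoculars_pick([0.0, 0.2, 0.4, 0.1], [1.0, 0.3, 0.1, 0.8], best=0.3)` returns the index of the high-variance candidate that also makes the EI-ranked batch.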

Gaussian processes with linear operator inequality constraints

C. Agrell, J. Mach. Learn. Res., 2019
This paper adopts the approach of using a sufficiently dense set of virtual observation locations where the constraint is required to hold, and derives the exact posterior for a conjugate likelihood of the constrained Gaussian process.

BOHB: Robust and Efficient Hyperparameter Optimization at Scale

This work proposes a new practical state-of-the-art hyperparameter optimization method, which consistently outperforms both Bayesian optimization and Hyperband on a wide range of problem types, including high-dimensional toy functions, support vector machines, feed-forward neural networks, Bayesian neural networks, deep reinforcement learning, and convolutional neural networks.

Maximizing acquisition functions for Bayesian optimization

This work shows that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization and identifies a common family of acquisition functions, including EI and UCB, whose characteristics not only facilitate but justify use of greedy approaches for their maximization.
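The reparameterization idea behind such Monte Carlo acquisition estimates can be illustrated with a small NumPy sketch: posterior samples are written as a deterministic function of the mean and Cholesky factor, so the estimate is differentiable in those quantities (the function name and interface here are illustrative, not from the paper; a real implementation would use an autograd framework).

```python
import numpy as np

def mc_expected_improvement(mu, cov, best, n_samples=4096, seed=0):
    """Monte Carlo estimate of batch expected improvement.

    Reparameterization trick: draw f = mu + L z with z ~ N(0, I), so the
    estimate is a deterministic function of mu and the Cholesky factor L,
    and hence amenable to gradient-based optimization.
    """
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov + 1e-9 * np.eye(len(mu)))  # jitter for stability
    z = rng.standard_normal((n_samples, len(mu)))
    f = mu + z @ L.T                       # samples from the joint posterior
    improvement = np.maximum(f.max(axis=1) - best, 0.0)
    return float(improvement.mean())
```

For a single candidate this converges to the closed-form EI; with a standard-normal posterior and incumbent 0, the true value is 1/sqrt(2*pi) ≈ 0.399.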

A Nonmyopic Approach to Cost-Constrained Bayesian Optimization

This paper forms cost-constrained BO as a constrained Markov decision process (CMDP), and develops an efficient rollout approximation to the optimal CMDP policy that takes both the cost and future iterations into account.

Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

This paper provides the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree, and equivalently optimizes all decision variables in the full tree jointly, in a "one-shot" fashion.

Bayesian Optimization Meets Bayesian Optimal Stopping

This paper proposes to unify BO (specifically, Gaussian process upper confidence bound, GP-UCB) with Bayesian optimal stopping (BO-BOS) to boost the epoch efficiency of BO, and empirically evaluates the performance of BO-BOS, demonstrating its generality in hyperparameter optimization of ML models and two other interesting applications.
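The GP-UCB component mentioned here is a one-line acquisition rule: pick the candidate maximizing the posterior mean plus a scaled posterior standard deviation. A minimal sketch (with a fixed exploration weight `beta` for illustration; the paper's optimal-stopping rule is not shown):

```python
import numpy as np

def gp_ucb_score(mu, sigma, beta=2.0):
    """GP-UCB acquisition: posterior mean plus an exploration bonus."""
    return mu + np.sqrt(beta) * sigma

# Toy posterior over three candidate hyperparameter settings.
mu = np.array([0.1, 0.5, 0.3])      # posterior means
sigma = np.array([0.9, 0.1, 0.4])   # posterior standard deviations
next_idx = int(np.argmax(gp_ucb_score(mu, sigma)))  # candidate to train next
```

Note the exploration bonus dominates here: the most uncertain candidate wins despite its low mean, which is exactly the optimism-under-uncertainty behavior GP-UCB is designed for.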

Freeze-Thaw Bayesian Optimization

This paper develops a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings and provides an information-theoretic framework to automate the decision process.