Scalable3-BO: Big Data meets HPC - A scalable asynchronous parallel high-dimensional Bayesian optimization framework on supercomputers

  • Anh Tran
  • Published 12 August 2021
  • Computer Science, Mathematics
  • ArXiv
Bayesian optimization (BO) is a flexible and powerful framework that is suitable for computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. While it remains one of the most popular optimization methods, its capability is hindered by the size of the data, the dimensionality of the problem, and the sequential nature of the optimization. These scalability issues are intertwined and must be tackled simultaneously…


aphBO-2GP-3B: A budgeted asynchronously-parallel multi-acquisition for known/unknown constrained Bayesian optimization on high-performing computing architecture
An asynchronous constrained batch-parallel Bayesian optimization method is proposed to efficiently solve computationally expensive simulation-based optimization problems on HPC platforms under a budgeted computational resource, where the maximum number of simulations is a constant.
When Gaussian Process Meets Big Data: A Review of Scalable GPs
This article reviews state-of-the-art scalable GPs in two main categories: global approximations that distill the entire data and local approximations that divide the data for subspace learning.
ExaGeoStat: A High Performance Unified Framework for Geostatistics on Manycore Systems
The ExaGeoStat framework takes a first step in the merger of large-scale data analytics and extreme computing for geospatial statistical applications, to be followed by additional complexity-reducing improvements from the solver side that can be implemented under the same interface.
Scalable Global Optimization via Local Bayesian Optimization
The TuRBO algorithm is proposed, which fits a collection of local models and performs a principled global allocation of samples across these models via an implicit bandit approach; it outperforms state-of-the-art methods from machine learning and operations research on problems spanning reinforcement learning, robotics, and the natural sciences.
Optimally Weighted Cluster Kriging for Big Data Regression
This work introduces a hybrid approach in which a number of Kriging models built on disjoint subsets of the data are properly weighted for the predictions and performs equally well in terms of accuracy.
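As a rough illustration of how predictions from several Kriging models can be weighted, the sketch below uses inverse-variance weighting, which minimizes the variance of the combined predictor when the individual models are unbiased and independent. This weighting rule, the function name `combine_predictions`, and the example numbers are assumptions made for illustration; they are not taken from the paper.

```python
def combine_predictions(means, variances):
    """Combine independent predictors with inverse-variance weights.

    Assumption: each model returns a predictive mean and variance at the
    same query point, and the models' errors are independent.
    """
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    weights = [p / total for p in precisions]          # weights sum to 1
    mean = sum(w * m for w, m in zip(weights, means))  # weighted prediction
    variance = 1.0 / total                             # combined variance
    return mean, variance, weights


# Hypothetical predictions from three cluster models at one query point.
means = [1.0, 2.0, 4.0]
variances = [0.5, 1.0, 2.0]
mean, variance, weights = combine_predictions(means, variances)
print(mean, variance, weights)
```

Under these assumptions the most confident model dominates the prediction, and the combined variance is never larger than the smallest individual variance.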
Parallel Approximation of the Maximum Likelihood Estimation for the Prediction of Large-Scale Geostatistics Simulations
The Exascale GeoStatistics software framework is extended to support the Tile Low-Rank (TLR) approximation technique, which exploits the data sparsity of the dense covariance matrix by compressing the off-diagonal tiles up to a user-defined accuracy threshold, which may ultimately reduce the arithmetic complexity of the maximum likelihood estimation and the corresponding memory footprint.
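The idea of compressing an off-diagonal tile up to an accuracy threshold can be sketched with a truncated SVD in NumPy. This is a minimal single-tile illustration, not the framework's implementation: the squared-exponential covariance, the tile size, the absolute threshold `tol`, and the helper name `compress_tile` are all assumptions chosen for the example.

```python
import numpy as np


def compress_tile(tile, tol):
    """Return thin factors (left, right) with left @ right close to tile.

    Singular values below the absolute threshold `tol` are dropped.
    """
    u, s, vt = np.linalg.svd(tile, full_matrices=False)
    rank = max(1, int(np.sum(s >= tol)))  # s is sorted in descending order
    return u[:, :rank] * s[:rank], vt[:rank, :]


# Hypothetical dense covariance matrix on sorted 1-D locations.
x = np.linspace(0.0, 1.0, 256)
cov = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.1) ** 2)

tile_size = 64
tile = cov[0:tile_size, tile_size:2 * tile_size]  # one off-diagonal tile
left, right = compress_tile(tile, tol=1e-8)

rank = left.shape[1]
error = np.linalg.norm(tile - left @ right, ord=2)  # spectral-norm error
dense_entries = tile.size
compressed_entries = left.size + right.size
print(rank, error, dense_entries, compressed_entries)
```

Storing the two thin factors instead of the dense tile is what reduces the memory footprint; the saving depends on how quickly the singular values of the tile decay.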
Globally Approximate Gaussian Processes for Big Data With Application to Data-Driven Metamaterials Design
GAGP achieves very high predictive power matching (and in some cases exceeding) that of state-of-the-art supervised learning methods, making it particularly useful in engineering design with big data.
Tile Low Rank Cholesky Factorization for Climate/Weather Modeling Applications on Manycore Architectures
A new and flexible tile low-rank Cholesky factorization is designed and a high-performance implementation using the OpenMP task-based programming model on various leading-edge manycore architectures is proposed, representing an important milestone in enabling large-scale simulations for covariance-based scientific applications.
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling
This work designs and analyses variations of the classical Thompson sampling procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive but can be performed in parallel, and shows that asynchronous TS outperforms a suite of existing parallel BO algorithms in simulations and in a hyper-parameter tuning application in convolutional neural networks.
Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization
This work develops GP-BUCB, a principled algorithm for choosing batches, based on the GP-UCB algorithm for sequential GP optimization, and proves a surprising result: compared to the sequential approach, the cumulative regret of the parallel algorithm only increases by a constant factor independent of the batch size B.
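A batch can be built from an upper-confidence-bound rule by exploiting the fact that a GP's posterior variance does not depend on the observed outputs: the mean is kept fixed at its value from real observations, while the variance is updated as if the pending points had already been evaluated. The sketch below illustrates that idea on a 1-D grid. The RBF kernel, the noise level, the constant `beta`, the toy objective, and all function names are assumptions for illustration, and the sketch omits the paper's specific choice of confidence parameter.

```python
import numpy as np


def rbf(a, b, length=0.2):
    """Squared-exponential kernel with unit prior variance."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)


def posterior(x_train, y_train, x_grid, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation on x_grid."""
    k = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf(x_grid, x_train)
    mean = k_star @ np.linalg.solve(k, y_train)
    var = 1.0 - np.sum(k_star * np.linalg.solve(k, k_star.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def select_batch(x_obs, y_obs, x_grid, batch_size, beta=2.0):
    """Greedy batch selection: fixed mean, variance shrunk by pending points."""
    mean, _ = posterior(x_obs, y_obs, x_grid)  # uses real observations only
    x_pending = x_obs.copy()
    batch = []
    for _ in range(batch_size):
        # Outputs are irrelevant for the variance, so zeros serve as dummies.
        _, std = posterior(x_pending, np.zeros(len(x_pending)), x_grid)
        idx = int(np.argmax(mean + beta * std))
        batch.append(idx)
        x_pending = np.append(x_pending, x_grid[idx])
    return batch


# Hypothetical toy problem: three observations of sin(2*pi*x) on [0, 1].
x_grid = np.linspace(0.0, 1.0, 101)
x_obs = np.array([0.1, 0.5, 0.9])
y_obs = np.sin(2.0 * np.pi * x_obs)
batch = select_batch(x_obs, y_obs, x_grid, batch_size=3)
print([float(x_grid[i]) for i in batch])
```

Because the uncertainty collapses around each pending point, later picks in the batch are pushed toward regions that are still unexplored rather than clustering around the first pick.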