Batch Bayesian Optimization via Local Penalization
A simple heuristic based on an estimate of the Lipschitz constant is investigated; it captures the most important aspect of the interaction between the points in a batch at negligible computational overhead and compares well, in running time, with much more elaborate alternatives.
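As a rough sketch of the mechanism (not the paper's full method): the acquisition function is multiplied by a soft penalizer around each pending evaluation, with the penalizer's reach governed by the estimated Lipschitz constant. In the Python below, `acq`, `gp_mean`, and `gp_std` are placeholders for a fitted surrogate, and `L` is assumed to be estimated externally (e.g., from the maximum gradient norm of the posterior mean):

```python
import numpy as np
from scipy.stats import norm

def local_penalizer(x, x_j, mu_j, sigma_j, L, M):
    """Soft penalizer phi(x; x_j) around a pending evaluation x_j.

    mu_j, sigma_j : GP posterior mean / std at x_j
    L             : estimated Lipschitz constant of the objective
    M             : best (lowest) function value observed so far
    """
    r = np.linalg.norm(x - x_j)
    z = (L * r - M + mu_j) / (np.sqrt(2.0) * sigma_j)
    return norm.cdf(np.sqrt(2.0) * z)  # equals 0.5 * erfc(-z)

def penalized_acquisition(acq, x, pending, gp_mean, gp_std, L, M):
    """Base acquisition times the penalizers of all pending points."""
    value = acq(x)
    for x_j in pending:
        value *= local_penalizer(x, x_j, gp_mean(x_j), gp_std(x_j), L, M)
    return value
```

Repeatedly maximizing the penalized acquisition and adding each maximizer to `pending` yields a batch without re-fitting the GP.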
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
A generative model of the validation error as a function of training-set size is proposed; it is learned during the optimization process and allows cheap exploration of candidate configurations on small subsets, extrapolating their performance to the full dataset.
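The paper's model is Bayesian and joint over configurations and dataset size; as a minimal stand-in for the extrapolation idea alone, one can fit a parametric learning curve to validation errors measured on growing subsets. All numbers and the power-law form below are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, c):
    # Common parametric assumption: error decays as a power law in n.
    return a * np.power(n, -b) + c

# Validation errors of one hyperparameter configuration on growing
# training subsets (illustrative numbers, not from the paper).
subset_sizes = np.array([500.0, 1000, 2000, 4000, 8000])
val_errors = np.array([0.42, 0.35, 0.30, 0.27, 0.25])

params, _ = curve_fit(learning_curve, subset_sizes, val_errors,
                      p0=(1.0, 0.5, 0.1), maxfev=10000)
full_size = 60000
print(f"extrapolated error at n={full_size}: "
      f"{learning_curve(full_size, *params):.3f}")
```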
Entropy Search for Information-Efficient Global Optimization
This paper develops desiderata for probabilistic optimization algorithms, then presents a concrete algorithm that resolves each computational intractability with a sequence of approximations and explicitly treats the decision problem of maximizing information gain from each evaluation.
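The central object is p_min, the belief over where the optimum lies; the paper approximates it with Expectation Propagation, but the same quantity can be estimated naively by Monte Carlo over a discrete candidate set, as in this sketch:

```python
import numpy as np

def pmin_entropy(mu, cov, n_samples=5000, seed=0):
    """Entropy of p_min, the belief over which candidate is the minimizer,
    for a joint Gaussian posterior (mu, cov) over candidate function values."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mu, cov, size=n_samples)
    p_min = np.bincount(samples.argmin(axis=1), minlength=len(mu)) / n_samples
    p = p_min[p_min > 0]
    return -(p * np.log(p)).sum(), p_min

# Toy posterior over three candidates: the last is probably the minimum.
H, p_min = pmin_entropy(mu=np.array([0.0, 0.1, -0.5]), cov=0.05 * np.eye(3))
print(p_min, H)
```

Entropy Search then selects the next evaluation to maximally reduce this entropy in expectation.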
The Randomized Dependence Coefficient
The Randomized Dependence Coefficient is introduced, a measure of nonlinear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient, which has low computational cost and is easy to implement.
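Since the coefficient is advertised as easy to implement, a sketch helps: rank-transform each variable to its empirical copula, push the result through random sinusoidal features, and take the largest canonical correlation between the two feature blocks. The defaults `k=20` and `s=1/6` below follow common reference implementations and should be treated as tunable assumptions:

```python
import numpy as np
from scipy.stats import rankdata

def rdc(x, y, k=20, s=1.0 / 6.0, seed=0):
    """Randomized Dependence Coefficient between samples x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    # 1) empirical copula transform (ranks scaled to (0, 1]) plus bias term
    cx = np.column_stack([rankdata(c) / len(c) for c in x.T] + [np.ones(len(x))])
    cy = np.column_stack([rankdata(c) / len(c) for c in y.T] + [np.ones(len(y))])
    # 2) random sinusoidal projections of the copula features
    fx = np.sin(cx @ rng.normal(0.0, s, (cx.shape[1], k)))
    fy = np.sin(cy @ rng.normal(0.0, s, (cy.shape[1], k)))
    # 3) largest canonical correlation between the two feature blocks
    C = np.cov(np.hstack([fx, fy]).T)
    cxx, cyy, cxy = C[:k, :k], C[k:, k:], C[:k, k:]
    eigs = np.linalg.eigvals(np.linalg.pinv(cxx) @ cxy
                             @ np.linalg.pinv(cyy) @ cxy.T)
    return float(np.sqrt(np.clip(np.max(eigs.real), 0.0, 1.0)))

t = np.linspace(0, 2 * np.pi, 500)
print(rdc(np.cos(t), np.sin(t)))   # deterministic nonlinear relation
print(rdc(np.random.default_rng(1).standard_normal(500),
          np.random.default_rng(2).standard_normal(500)))  # ~ independent
```

The dependent pair should score well above the independent baseline.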
Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences
- Motonobu Kanagawa, Philipp Hennig, D. Sejdinovic, Bharath K. Sriperumbudur
- Computer Science · ArXiv
- 6 July 2018
This paper is an attempt to bridge the conceptual gaps between researchers working on the two widely used approaches based on positive definite kernels: Bayesian learning or inference using Gaussian processes, and frequentist kernel methods based on reproducing kernel Hilbert spaces.
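One equivalence the review makes precise: with noise variance σ², the GP posterior mean coincides with the kernel ridge regression predictor whose regularizer equals σ². A numerical check of that identity (illustrative code assuming scikit-learn; not from the paper):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel, exp(-||a - b||^2 / (2 ls^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (25, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(25)
Xstar = np.linspace(-3, 3, 7)[:, None]

sigma2 = 0.01  # GP noise variance; reused as the KRR regularizer

# GP posterior mean: k(x*, X) (K + sigma^2 I)^{-1} y
gp_mean = rbf(Xstar, X) @ np.linalg.solve(
    rbf(X, X) + sigma2 * np.eye(len(X)), y)

# Kernel ridge regression with the same kernel (gamma = 1 / (2 ls^2))
krr = KernelRidge(alpha=sigma2, kernel="rbf", gamma=0.5).fit(X, y)

print(np.max(np.abs(gp_mean - krr.predict(Xstar))))  # ~0: they agree
```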
Probabilistic numerics and uncertainty in computations
- Philipp Hennig, Michael A. Osborne, M. Girolami
- Computer Science · Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
- 3 June 2015
It is shown that the probabilistic view suggests new algorithms that can be flexibly adapted to suit application specifics, while delivering improved empirical performance.
Probabilistic Line Searches for Stochastic Optimization
A probabilistic line search is constructed by combining the structure of existing deterministic methods with notions from Bayesian optimization; it retains a Gaussian process surrogate of the univariate optimization objective and uses a probabilistic belief over the Wolfe conditions to monitor the descent.
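The monitored quantity is the probability, under the GP belief, that a candidate step size satisfies both Wolfe conditions. Because both conditions are linear in the jointly Gaussian values [f(0), f'(0), f(t), f'(t)], this is a bivariate Gaussian orthant probability. In the sketch below, `mean` and `cov` are placeholders for the GP's joint posterior along the search line:

```python
import numpy as np
from scipy.stats import multivariate_normal

def p_wolfe(mean, cov, t, c1=1e-4, c2=0.9):
    """P(both Wolfe conditions hold at step size t), given a Gaussian
    belief (mean, cov) over [f(0), f'(0), f(t), f'(t)].

    Sufficient decrease: a = f(0) - f(t) + c1*t*f'(0) >= 0
    Curvature:           b = f'(t) - c2*f'(0)         >= 0
    """
    A = np.array([[1.0, c1 * t, -1.0, 0.0],
                  [0.0, -c2, 0.0, 1.0]])
    m, S = A @ mean, A @ cov @ A.T
    # P(a >= 0, b >= 0) = P(-a <= 0, -b <= 0): Gaussian orthant probability
    return multivariate_normal(mean=-m, cov=S).cdf(np.zeros(2))

# Toy belief: near-certain decrease, mildly uncertain curvature.
mean = np.array([1.0, -1.0, 0.6, -0.05])
cov = np.diag([1e-6, 1e-6, 0.01, 0.01])
print(p_wolfe(mean, cov, t=1.0))
```

A step is accepted once this probability exceeds a fixed threshold; otherwise the belief guides where to probe next along the line.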
Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients
This analysis extends recent results on the adverse effects of Adam on generalization, isolating the sign aspect as the problematic one; transferring the variance adaptation to SGD gives rise to a novel method, completing the practitioner's toolbox for problems where Adam fails.
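The analysis hinges on factoring Adam's element-wise direction into a sign and a magnitude, m/√v = sign(m) · |m|/√v, with the magnitude acting as variance adaptation. The sketch below shows that split and a variance-adapted SGD direction in the spirit of the paper's M-SVAG; `m` and `s2` stand for running estimates of the gradient mean and per-coordinate variance:

```python
import numpy as np

def adam_direction(m, v, eps=1e-8):
    """Adam's step direction, written to expose the sign/magnitude split."""
    sign, magnitude = np.sign(m), np.abs(m) / (np.sqrt(v) + eps)
    return sign * magnitude

def variance_adapted_sgd_direction(m, s2):
    """Variance adaptation without the sign: damp coordinates whose
    gradient estimate is noisy relative to its mean (M-SVAG-like)."""
    gamma = m**2 / (m**2 + s2 + 1e-16)  # in [0, 1]; 1 = trust fully
    return gamma * m
```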
Limitations of the empirical Fisher approximation for natural gradient descent
It is argued that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.
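A small numerical illustration of the distinction (logistic regression, illustrative numbers): the Fisher averages gradient outer products over labels drawn from the model, which here has a closed form, while the empirical Fisher plugs in the observed labels; away from a well-fit parameter the two matrices visibly differ:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 3
X = rng.standard_normal((n, d))
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ np.array([2.0, -1.0, 0.5])))
     ).astype(float)

theta = np.array([0.5, 0.5, 0.5])   # deliberately away from the optimum
p = 1 / (1 + np.exp(-X @ theta))    # model's p(y=1 | x, theta)

# Fisher: E_{y ~ model}[g g^T] = (1/n) X^T diag(p (1 - p)) X  (closed form)
F = (X * (p * (1 - p))[:, None]).T @ X / n

# Empirical Fisher: gradient outer products at the *observed* labels
G = (y - p)[:, None] * X            # per-example grad of log-likelihood
F_emp = G.T @ G / n

print(np.linalg.norm(F - F_emp))    # clearly nonzero: the two differ
```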
Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature
- T. Gunter, Michael A. Osborne, R. Garnett, Philipp Hennig, S. Roberts
- Computer Science · NIPS
- 3 November 2014
A warped model for probabilistic integrands (likelihoods) that are known to be non-negative is introduced, permitting a cheap active learning scheme to optimally select sample locations.
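The warp is a square root: model g = √(2ℓ) with a GP, so that ℓ = g²/2 is non-negative by construction, and recover a belief over ℓ by moment matching. Below is a compact 1-D sketch using plain grid quadrature of the resulting posterior mean; the paper's active-learning step, hyperparameter handling, and closed-form kernel integrals are omitted, and the integrand is illustrative:

```python
import numpy as np

def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

# Non-negative integrand standing in for an unnormalized likelihood.
f = lambda x: np.exp(-x**2) * (np.sin(3 * x) ** 2 + 0.1)

X = np.linspace(-3, 3, 20)          # evaluation locations
g = np.sqrt(2 * f(X))               # square-root warp of the observations

K = rbf(X, X) + 1e-8 * np.eye(len(X))
alpha = np.linalg.solve(K, g)

xs = np.linspace(-3, 3, 400)
Ks = rbf(xs, X)
m_g = Ks @ alpha                    # posterior mean of the warped function
v_g = np.clip(1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T)),
              0.0, None)            # posterior variance of g

m_f = 0.5 * (m_g**2 + v_g)          # moment-matched mean of l = g^2 / 2

dx = xs[1] - xs[0]
print("warped-BQ estimate:", dx * m_f.sum())
print("quadrature check  :", dx * f(xs).sum())
```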