• Corpus ID: 4445161

Stein Points

Wilson Ye Chen, Lester W. Mackey, Jackson Gorham, François-Xavier Briol, Chris J. Oates
An important task in computational statistics and machine learning is to approximate a posterior distribution $p(x)$ with an empirical measure supported on a set of representative points $\{x_i\}_{i=1}^n$. The idea is to exploit either a greedy or a conditional gradient method to iteratively minimise a kernel Stein discrepancy between the empirical measure and $p(x)$. Our empirical results demonstrate that Stein Points enable accurate approximation of the posterior at modest computational cost…
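The greedy variant mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-dimensional standard normal target (score $-x$), an inverse multiquadric base kernel with $c = 1$, a fixed candidate grid in place of a numerical optimiser, and the helper names `stein_kernel`, `ksd` and `greedy_stein_points`. Adding a point $x$ to $\{x_i\}$ changes the double sum that defines the squared discrepancy by $k_p(x,x) + 2\sum_i k_p(x_i,x)$, so each step picks the candidate that minimises half of that quantity.

```python
import numpy as np


def stein_kernel(x, y, c=1.0):
    """Langevin Stein kernel k_p(x, y) for an assumed N(0, 1) target.

    Base kernel (an illustrative choice): k(x, y) = (c^2 + (x - y)^2)^(-1/2).
    """
    r = x - y
    u = c**2 + r**2
    k = u**-0.5
    dk_dx = -r * u**-1.5                      # derivative of k in x
    dk_dy = r * u**-1.5                       # derivative of k in y
    d2k = u**-1.5 - 3.0 * r**2 * u**-2.5      # mixed derivative in x and y
    sx, sy = -x, -y                           # score of N(0, 1) is -x
    return d2k + dk_dx * sy + dk_dy * sx + k * sx * sy


def ksd(points):
    """Kernel Stein discrepancy of the empirical measure on `points`."""
    pts = np.asarray(points, dtype=float)
    gram = stein_kernel(pts[:, None], pts[None, :])
    return float(np.sqrt(max(gram.mean(), 0.0)))


def greedy_stein_points(n, candidates):
    """Select n points one at a time from a fixed candidate grid."""
    diag = stein_kernel(candidates, candidates)   # k_p(x, x) per candidate
    running = np.zeros_like(candidates)           # sum_i k_p(x_i, x)
    selected = []
    for _ in range(n):
        objective = 0.5 * diag + running
        best = candidates[np.argmin(objective)]
        selected.append(best)
        running += stein_kernel(best, candidates)
    return np.array(selected)


candidates = np.linspace(-5.0, 5.0, 1001)
points = greedy_stein_points(20, candidates)
print("KSD after 1 point:", ksd(points[:1]))
print("KSD after 20 points:", ksd(points))
```

The grid search is only practical in very low dimension; it stands in here for whatever optimiser is used to solve the per-step minimisation.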

Stein Point Markov Chain Monte Carlo

This paper removes the need to solve this optimisation problem by selecting each new point based on a Markov chain sample path, which significantly reduces the computational cost of Stein Points and leads to a suite of algorithms that are straightforward to implement.

On the geometry of Stein variational gradient descent

This paper focuses on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm, and considers certain nondifferentiable kernels with adjusted tails.

Optimal thinning of MCMC output

M. Riabiz, W. Chen, C. Oates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2022.
A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations.

Gradient-Free Kernel Stein Discrepancy

Stein discrepancies have emerged as a powerful statistical tool, being applied to fundamental statistical problems including parameter inference, goodness-of-fit testing, and sampling. The canonical

Bayesian Posterior Approximation via Greedy Particle Optimization

A novel method, maximum mean discrepancy minimization by the Frank-Wolfe algorithm (MMD-FW), is proposed. It minimizes MMD greedily using the FW algorithm, and its finite-sample convergence bound is shown to be of linear order in finite dimensions.

The reproducing Stein kernel approach for post-hoc corrected sampling

Stein importance sampling is a widely applicable technique based on kernelized Stein discrepancy, which corrects the output of approximate sampling algorithms by reweighting the empirical

Projected Stein Variational Gradient Descent

This work proposes a projected Stein variational gradient descent (pSVGD) method to overcome the curse of dimensionality by exploiting the fundamental property of intrinsic low dimensionality of the data informed subspace stemming from ill-posedness of such problems.

Model Inference with Stein Density Ratio Estimation

The estimated density ratio allows us to compute the likelihood ratio function, which is a surrogate for the actual Kullback-Leibler divergence from model to data, and to perform model fitting and inference from either a frequentist or a Bayesian point of view.

Kernel Stein Discrepancy Descent

The convergence properties of KSD Descent are studied and its practical relevance is demonstrated, but failure cases are highlighted by showing that the algorithm can get stuck in spurious local minima.

A Riemann–Stein kernel method

This paper proposes and studies a numerical method for approximation of posterior expectations based on interpolation with a Stein reproducing kernel. Finite-sample-size bounds on the approximation



Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the

Measuring Sample Quality with Kernels

A theory of weak convergence for KSDs based on Stein's method is developed. It is demonstrated that commonly used KSDs fail to detect non-convergence even for Gaussian targets, and it is shown that kernels with slowly decaying tails provably determine convergence for a large class of target distributions.

Support points

A new way to compact a continuous probability distribution into a set of representative points, called support points, is proposed. The points are obtained by minimizing the energy distance, which can be formulated as a difference-of-convex program, and two algorithms exploit this formulation to efficiently generate representative point sets.

Posterior Integration on a Riemannian Manifold

The geodesic Markov chain Monte Carlo method and its variants enable computation of integrals with respect to a posterior supported on a manifold. However, for regular integrals, the convergence rate

Control functionals for Monte Carlo integration

A non‐parametric extension of control variates is presented. These leverage gradient information on the sampling density to achieve substantial variance reduction. It is not required that the

Stein Variational Gradient Descent as Gradient Flow

This paper develops the first theoretical analysis on SVGD, discussing its weak convergence properties and showing that its asymptotic behavior is captured by a gradient flow of the KL divergence functional under a new metric structure induced by Stein operator.

Measuring Sample Quality with Diffusions

This work develops computable and convergence-determining diffusion Stein discrepancies for log-concave, heavy-tailed, and multimodal targets and uses these quality measures to select the hyperparameters of biased samplers, compare random and deterministic quadrature rules, and quantify bias-variance tradeoffs in approximate Markov chain Monte Carlo.

A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training

A general convergence criterion for certain iterative sequences in Hilbert space is presented. For an important subclass of these sequences, estimates of the rate of convergence are given. Under very

Measuring Sample Quality with Stein's Method

This work introduces a new computable quality measure based on Stein's method that quantifies the maximum discrepancy between sample and target expectations over a large class of test functions and uses this tool to compare exact, biased, and deterministic sample sequences.

A Kernel Test of Goodness of Fit

A nonparametric statistical test for goodness-of-fit is proposed: given a set of samples, the test determines how likely it is that these were generated from a target density function, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel.
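For orientation, a V-statistic of the kind described above is commonly written as follows. The notation is assumed for illustration and does not come from this page: $k$ is a base kernel, $s_p = \nabla \log p$ is the log gradient (score) of the target density, and $x_1, \dots, x_n$ are the samples.

```latex
k_p(x, y) = \nabla_x \cdot \nabla_y k(x, y)
          + \nabla_x k(x, y) \cdot s_p(y)
          + \nabla_y k(x, y) \cdot s_p(x)
          + k(x, y)\, s_p(x) \cdot s_p(y),
\qquad
V_n = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k_p(x_i, x_j).
```

Only $s_p$ enters the expression, so the normalising constant of $p$ is not needed to evaluate $V_n$.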