# MLMCMC – Multilevel Markov Chain Monte Carlo

- 2013

#### Abstract

In this talk we address the problem of the prohibitively large computational cost of existing Markov chain Monte Carlo (MCMC) methods for large-scale applications with high-dimensional parameter spaces, e.g. uncertainty quantification in porous media flow. We propose a new multilevel Metropolis-Hastings algorithm, and give an abstract, problem-dependent theorem on the cost of the new multilevel estimator based on a set of simple, verifiable assumptions. For a typical model problem in subsurface flow, we then provide a detailed analysis of these assumptions and show significant gains over the standard Metropolis-Hastings estimator.

The parameters in mathematical models for many physical processes are often impossible to determine fully or accurately, and are hence subject to uncertainty. It is of great importance to quantify the uncertainty in the model outputs based on the (uncertain) information that is available on the model inputs. A popular way to achieve this is stochastic modelling. Based on the available information, a probability distribution (the prior in the Bayesian framework) is assigned to the input parameters. If, in addition, some dynamic data (or observations) related to the model outputs are available, it is possible to reduce the overall uncertainty and to get a better representation of the model by conditioning the prior distribution on this data (leading to the posterior).

In most situations, however, the posterior distribution is intractable in the sense that exact sampling from it is unavailable. One way to circumvent this problem is to generate samples using a Metropolis-Hastings-type MCMC approach [9], which consists of two main steps: (i) given the previous sample, a new sample is generated according to some proposal distribution, such as a random walk; (ii) the likelihood of this new sample (the data fit) is compared to the likelihood of the previous sample.
Based on this comparison, the proposed sample is then either accepted and used for inference, or rejected, in which case the previous sample is used again; this leads to a Markov chain.

A major problem with MCMC is the high cost of the likelihood calculation for large-scale applications, since it commonly involves the numerical solution of a partial differential equation (PDE) with highly varying coefficients, usually on a very fine spatial grid for accuracy reasons. Due to the slow convergence of Monte Carlo averaging, the number of samples is also large and, moreover, the likelihood has to be calculated not only for the samples …
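
The two steps of the standard (single-level) Metropolis-Hastings approach described above can be sketched in a few lines of Python. This is only an illustrative toy, not the multilevel algorithm of the talk: the scalar parameter `theta`, the standard normal prior, the Gaussian noise model, and the function names `log_prior`, `log_likelihood` and `metropolis_hastings` are all assumptions made for the example. The forward "model" here is simply `theta` itself, whereas in the applications discussed above each likelihood evaluation would require a PDE solve. For a symmetric random-walk proposal, the comparison in step (ii) is made on the unnormalised posterior (prior times likelihood).

```python
import math
import random


def log_prior(theta):
    # Assumed prior: standard normal N(0, 1), up to an additive constant.
    return -0.5 * theta * theta


def log_likelihood(theta, data, noise_std):
    # Assumed Gaussian data misfit. The toy forward model is theta itself;
    # in a real application this call would hide an expensive PDE solve.
    return -0.5 * sum((y - theta) ** 2 for y in data) / noise_std ** 2


def metropolis_hastings(data, noise_std, n_samples, step=0.5, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    log_post = log_prior(theta) + log_likelihood(theta, data, noise_std)
    chain, accepted = [], 0
    for _ in range(n_samples):
        # (i) Propose a new sample with a symmetric random walk.
        proposal = theta + step * rng.gauss(0.0, 1.0)
        log_post_prop = log_prior(proposal) + log_likelihood(
            proposal, data, noise_std
        )
        # (ii) Compare proposal and previous sample; accept with
        # probability min(1, posterior ratio), working in log space.
        if math.log(1.0 - rng.random()) < log_post_prop - log_post:
            theta, log_post = proposal, log_post_prop
            accepted += 1
        # On rejection the previous sample is repeated in the chain.
        chain.append(theta)
    return chain, accepted / n_samples


# Hypothetical observations and noise level, chosen only for illustration.
chain, acceptance_rate = metropolis_hastings(
    data=[1.2, 0.8, 1.0], noise_std=0.5, n_samples=20000
)
burn_in = 2000
posterior_mean = sum(chain[burn_in:]) / len(chain[burn_in:])
```

Because this toy problem is conjugate, the exact posterior mean is 12/13, so the chain average `posterior_mean` can be checked against it. Every iteration calls `log_likelihood` once, whether or not the proposal is accepted, which is where the cost discussed above arises when each evaluation involves a fine-grid PDE solve.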