Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

Justin Alsing, Benjamin Dan Wandelt, Stephen M. Feeney
Monthly Notices of the Royal Astronomical Society
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to… 
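The simulate-and-compare idea described in the abstract can be illustrated with a minimal rejection-sampling sketch. This is plain rejection ABC on a toy problem, not the density-estimation method of the paper; the Gaussian forward model, the flat prior on [-5, 5], the sample-mean summary, and the tolerance `epsilon` are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate(mu, n, rng):
    """Hypothetical forward model: n Gaussian draws with mean mu, unit variance."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

def compress(data):
    """Compress the full data vector to a single summary (the sample mean)."""
    return statistics.fmean(data)

def abc_rejection(observed, n_sims, epsilon, rng):
    """Keep prior draws whose simulated summary lies within epsilon of the observed one."""
    s_obs = compress(observed)
    accepted = []
    for _ in range(n_sims):
        mu = rng.uniform(-5.0, 5.0)  # draw from the assumed flat prior
        s_sim = compress(simulate(mu, len(observed), rng))
        if abs(s_sim - s_obs) < epsilon:  # compare in summary space, not data space
            accepted.append(mu)
    return accepted

rng = random.Random(0)
observed = simulate(1.5, 50, rng)  # mock "observed" data with true mean 1.5
posterior_samples = abc_rejection(observed, 20000, 0.1, rng)
posterior_mean = statistics.fmean(posterior_samples)
```

Comparing the one-number summary rather than all 50 data points is what keeps the acceptance rate usable; comparing in the full data space would reject almost every simulation, which is the curse of dimensionality the abstract refers to.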

Fast likelihood-free cosmology with neural density estimators and active learning
NDEs are used to learn the likelihood function from a set of simulated datasets, with active learning to adaptively acquire simulations in the most relevant regions of parameter space on-the-fly, demonstrating the approach on a number of cosmological case studies.
Bayesian optimization for likelihood-free cosmological inference
This work addresses the problem of performing likelihood-free Bayesian inference from black-box simulation-based models, under the constraint of a very limited simulation budget, and adopts an approach based on the likelihood of an alternative parametric model.
Fast and Credible Likelihood-Free Cosmology with Truncated Marginal Neural Ratio Estimation
This paper shows that TMNRE can achieve converged posteriors using orders of magnitude fewer simulator calls than conventional Markov Chain Monte Carlo methods, and promises to become a powerful tool for cosmological data analysis, particularly in the context of extended cosmologies.
Parameter inference and model comparison using theoretical predictions from noisy simulations
This work shows how to correct the likelihood in the presence of an estimated summary statistic by marginalizing over the true summary statistic in the framework of a Bayesian hierarchical model and presents an alteration to the Sellentin–Heavens corrected likelihood.
Nuisance hardened data compression for fast likelihood-free inference
We show how nuisance parameter marginalized posteriors can be inferred directly from simulations in a likelihood-free setting, without having to jointly infer the higher dimensional interesting and
Likelihood-free Forward Modeling for Cluster Weak Lensing and Cosmology
Likelihood-free inference provides a rigorous approach to performing Bayesian analysis using forward simulations only. The main advantage of likelihood-free methods is their ability to account for
Primordial power spectrum and cosmology from black-box galaxy surveys
We propose a new, likelihood-free approach to inferring the primordial matter power spectrum and cosmological parameters from arbitrarily complex forward models of galaxy surveys where all relevant
Mining gold from implicit models to improve likelihood-free inference
Inference techniques for this case are presented that combine the insight that additional latent information can be extracted from the simulator with the power of neural networks in regression and density estimation tasks, leading to better sample efficiency and quality of inference.
Gaussbock: Fast Parallel-iterative Cosmological Parameter Estimation with Bayesian Nonparametrics
We present and apply Gaussbock, a new embarrassingly parallel iterative algorithm for cosmological parameter estimation designed for an era of cheap parallel-computing resources. Gaussbock uses
Likelihood-free Inference of Fornax Dark Matter Density Profile
The standard model of cosmology, ΛCDM, predicts that the dark matter (DM) density profile should diverge as r⁻¹ at the center of dwarf galaxies (cusp), while the observations tend to suggest a flatter


Approximate Bayesian computation (ABC) methods are presented and discussed in the context of supernova cosmology using data from the SDSS-II Supernova Survey and it is demonstrated that ABC can recover an accurate posterior distribution.
Approximate Bayesian computation in large-scale structure: constraining the galaxy-halo connection
This work demonstrates that ABC is feasible for LSS parameter inference by using it to constrain parameters of the halo occupation distribution (HOD) model for populating dark matter halos with galaxies and suggests that ABC can and should be applied in parameter inference for LSS analyses.
cosmoabc: Likelihood-free inference via Population Monte Carlo Approximate Bayesian Computation
Likelihood-Free Inference in Cosmology: Potential for the Estimation of Luminosity Functions
This paper will present an overview of methods that allow a likelihood-free approach to inference, with emphasis on approximate Bayesian computation, a class of procedures originally motivated by similar inference problems in population genetics.
Massive data compression for parameter-dependent covariance matrices
MOPED can be used to reduce, by orders of magnitude, the number of simulated datasets that are required to estimate the covariance matrix required for the analysis of Gaussian-distributed data, making an otherwise intractable analysis feasible.
Multimodal nested sampling: an efficient and robust alternative to Markov Chain Monte Carlo methods for astronomical data analyses
Three new methods for sampling and evidence evaluation from distributions that may contain multiple modes and significant degeneracies in very high dimensions are presented, leading to a further substantial improvement in sampling efficiency and robustness and an even more efficient technique for estimating the uncertainty on the evaluated evidence.
Accelerating Approximate Bayesian Computation with Quantile Regression: application to cosmological redshift distributions
This paper presents a novel method, called qABC, to accelerate ABC with quantile regression: it creates a model of quantiles of the distance measure as a function of input parameters, and applies it to the practical problem of estimating the redshift distribution of cosmological samples.
Massive lossless data compression and multiple parameter estimation from galaxy spectra
We present a method for radical linear compression of data sets where the data are dependent on some number M of parameters. We show that, if the noise in the data is independent of the parameters,
Generalized massive optimal data compression
This paper provides a general procedure for optimally compressing data down to summary statistics, showing that compression to the score function -- the gradient of the log-likelihood with respect to the parameters -- yields compressed statistics that are optimal in the sense that they preserve the Fisher information content of the data.
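As a rough illustration of compression to the score function, the sketch below assumes a Gaussian likelihood with known, parameter-independent noise and a hypothetical straight-line model mu_i = theta_0 + theta_1 * x_i. Under those assumptions the score at a fiducial point is t = grad(mu)^T C^{-1} (d - mu), one number per parameter. The model, the fiducial point, and all function names are illustrative and not taken from the paper.

```python
def model(theta, xs):
    """Hypothetical mean model: a straight line evaluated at each x."""
    return [theta[0] + theta[1] * x for x in xs]

def model_gradient(xs):
    """d mu_i / d theta_a for the line model; one row per parameter."""
    return [[1.0 for _ in xs], [x for x in xs]]

def score_compress(data, theta_fid, xs, sigma):
    """Score of the Gaussian log-likelihood at theta_fid (diagonal covariance sigma^2)."""
    mu = model(theta_fid, xs)
    return [sum(g * (d - m) / sigma**2 for g, d, m in zip(row, data, mu))
            for row in model_gradient(xs)]

def fisher(xs, sigma):
    """Fisher matrix F_ab = sum_i (d_a mu_i)(d_b mu_i) / sigma^2."""
    grad = model_gradient(xs)
    return [[sum(ga * gb for ga, gb in zip(ra, rb)) / sigma**2 for rb in grad]
            for ra in grad]

def solve_2x2(F, t):
    """Solve F @ delta = t for a 2x2 matrix by Cramer's rule."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return [(F[1][1] * t[0] - F[0][1] * t[1]) / det,
            (F[0][0] * t[1] - F[1][0] * t[0]) / det]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
sigma = 0.5
theta_true = [1.0, 2.0]
theta_fid = [0.0, 0.0]          # fiducial expansion point (assumed)
data = model(theta_true, xs)    # noiseless mock data, for a deterministic check

t = score_compress(data, theta_fid, xs, sigma)   # 5 data points -> 2 summaries
delta = solve_2x2(fisher(xs, sigma), t)
theta_hat = [f + d for f, d in zip(theta_fid, delta)]
```

Because the toy model is linear in the parameters, `theta_hat` recovers `theta_true` from the two compressed numbers alone; for a nonlinear model the same step would only give an approximate estimate near the fiducial point.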