Density estimation with distribution element trees

  • Daniel W. Meyer
  • Published 2 October 2016
  • Computer Science
  • Statistics and Computing
The estimation of probability densities based on available data is a central task in many statistical applications. Especially in the case of large ensembles with many samples or high-dimensional sample spaces, computationally efficient methods are needed. We propose a new method that is based on a decomposition of the unknown distribution in terms of so-called distribution elements (DEs). These elements enable an adaptive and hierarchical discretization of the sample space with small or large… 
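To make the idea of an adaptive, hierarchical discretization concrete, here is a minimal one-dimensional sketch. It is not the paper's algorithm: the splitting rule (halve an element while it holds more than `max_count` samples) and the names `Element`, `build_tree`, `leaves`, and `density` are assumptions chosen for illustration, and each leaf element simply carries a constant density.

```python
import random
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Element:
    """One cell of the partition; a leaf holds a constant density."""
    lo: float
    hi: float
    count: int
    left: Optional["Element"] = None
    right: Optional["Element"] = None


def build_tree(samples: List[float], lo: float, hi: float,
               max_count: int = 50, min_width: float = 1e-6) -> Element:
    """Recursively halve [lo, hi] while an element holds too many samples.

    Assumption: a simple count threshold stands in for whatever
    splitting criterion a real method would use.
    """
    node = Element(lo, hi, len(samples))
    if len(samples) > max_count and (hi - lo) > min_width:
        mid = 0.5 * (lo + hi)
        node.left = build_tree([x for x in samples if x < mid],
                               lo, mid, max_count, min_width)
        node.right = build_tree([x for x in samples if x >= mid],
                                mid, hi, max_count, min_width)
    return node


def leaves(node: Element) -> Iterator[Element]:
    """Yield the leaf elements, which together tile the root interval."""
    if node.left is None:
        yield node
    else:
        yield from leaves(node.left)
        yield from leaves(node.right)


def density(root: Element, x: float, n: int) -> float:
    """Piecewise-constant estimate: leaf count / (n * leaf width)."""
    if not (root.lo <= x <= root.hi):
        return 0.0
    node = root
    while node.left is not None:
        node = node.left if x < node.left.hi else node.right
    return node.count / (n * (node.hi - node.lo))


# Usage: small elements form where samples are dense, large ones in the tails.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
root = build_tree(data, min(data), max(data))
estimate_at_zero = density(root, 0.0, len(data))
```

Because every sample falls in exactly one leaf, the leaf masses sum to one, so the estimate integrates to one over the root interval.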
Density estimation of multivariate samples using Wasserstein distance
  • E. Luini, P. Arbenz
  • Computer Science, Mathematics
    Journal of Statistical Computation and Simulation
  • 2019
Since the resulting density estimator requires significantly less memory to be stored, it can be used in a situation where the information contained in a multivariate sample needs to be preserved, transferred or analysed.
Offline and Online Density Estimation for Large High-Dimensional Data
This work develops computationally efficient algorithms for high-dimensional density estimation based on Bayesian sequential partitioning (BSP). It proposes a progressive update of the binary partitions in BBSP, which leads to improved accuracy as well as speed-up for various block sizes.
(Un)Conditional Sample Generation Based on Distribution Element Trees
  • D. Meyer
  • Computer Science
    Journal of Computational and Graphical Statistics
  • 2018
This work demonstrates that the DET formulation offers an easy and inexpensive way to generate random samples, similar to a smooth bootstrap. Samples can be generated unconditionally, or conditionally using available information about certain probability-space components.
Learning Weighted Model Integration Distributions
This work proposes LARIAT, a novel method that learns a structured support and combines it with a density learned using a state-of-the-art estimation method, automatically accounting for the discontinuous nature of the underlying structured distribution.
An Example of Augmenting Regional Sensitivity Analysis Using Machine Learning Software
Regional sensitivity analysis, RSA, has been widely applied in assessing the parametric sensitivity of environmental and hydrological models, in part because of its inherent simplicity. In that
Dependence structure estimation using Copula Recursive Trees
Investigation of gas separation technique based on selective rotational excitation of different species by a laser
In this work, a gas separation approach based on the selective rotational excitation of different species is investigated. The presented method is particularly suitable for separating gases of
Data-based modeling of gas-surface interaction in rarefied gas flow simulations
In this work, a data-based approach to gas-surface interaction modeling, which employs the recently introduced distribution element tree (DET) method, is proposed. The DET method allows efficient d...


Density Estimation Trees
This work states that density estimation in high dimensions remains a challenging problem due to the curse of dimensionality, which affects the convergence rates of many popular density estimation techniques.
Polynomial Histograms for Multivariate Density and Mode Estimation
First‐ and second‐order polynomial histogram estimators for a general d‐dimensional setting are presented, together with their pointwise bias and variance, asymptotic mean integrated square error (AMISE), and optimal binwidth.
Nonparametric multivariate density estimation using mixtures
A new method is proposed for nonparametric multivariate density estimation. It extends a general framework recently developed in the univariate case, based on nonparametric and semiparametric mixture distributions, and performs remarkably better than kernel-based density estimators.
An exact and easily computable expression for the mean integrated squared error (MISE) of the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order.
A study of logspline density estimation
Density Estimation in Infinite Dimensional Exponential Families
The main goal of the paper is to estimate an unknown density $p_0$ through an element in $\mathcal{P}$. The proposed estimator involves solving a simple finite-dimensional linear system, and it is demonstrated that it outperforms the non-parametric kernel density estimator, with an advantage that grows as $d$ increases.
Coupling Optional Pólya Trees and the Two Sample Problem
This work proposes a theoretical framework for inference in the form of a prior for Bayesian nonparametric analysis. The prior is based on a random-partition-and-assignment procedure similar to the one that defines the standard optional Pólya tree distribution, but it can generate multiple random distributions jointly.
Functional data analysis for density functions by transformation to a Hilbert space
Functional data that are nonnegative and have a constrained integral can be considered as samples of one-dimensional density functions. Such data are ubiquitous. Due to the inherent constraints,