• Corpus ID: 88520612

Density Estimation via Discrepancy

Kun Yang, Hao Su, Wing Hung Wang. arXiv: Machine Learning.
Given i.i.d. samples from some unknown continuous density on the hyper-rectangle $[0, 1]^d$, we attempt to learn a piecewise constant function that approximates this underlying density non-parametrically. Our density estimate is defined on a binary split of $[0, 1]^d$ and is built up sequentially according to discrepancy criteria; the key ingredient is to control the discrepancy adaptively in each sub-rectangle to achieve an overall bound. We prove that the estimate, even though simple as it appears…
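The idea of a piecewise constant estimate on a binary partition can be sketched as follows. This is an illustration only and not the authors' algorithm: the splitting rule (cut at the midpoint of the axis whose two halves are most unbalanced) is a crude stand-in for a discrepancy criterion, and the names `fit_partition`, `theta`, `min_points` and `max_depth` are hypothetical.

```python
import numpy as np

def fit_partition(points, lo, hi, n_total, theta=0.05,
                  min_points=20, depth=0, max_depth=12):
    """Return a list of (lo, hi, density) leaves covering the box [lo, hi)."""
    n = len(points)
    volume = float(np.prod(hi - lo))
    mid = (lo + hi) / 2.0
    if n >= min_points and depth < max_depth:
        # Fraction of this cell's points in the lower half along each axis.
        frac_low = (points < mid).mean(axis=0)
        # Stand-in criterion (assumption): deviation from the 1/2 expected
        # if the points were uniform inside the cell.
        gap = np.abs(frac_low - 0.5)
        axis = int(np.argmax(gap))
        if gap[axis] > theta:
            below = points[:, axis] < mid[axis]
            hi_left = hi.copy()
            hi_left[axis] = mid[axis]
            lo_right = lo.copy()
            lo_right[axis] = mid[axis]
            return (fit_partition(points[below], lo, hi_left, n_total,
                                  theta, min_points, depth + 1, max_depth)
                    + fit_partition(points[~below], lo_right, hi, n_total,
                                    theta, min_points, depth + 1, max_depth))
    # Leaf: constant density = (share of all samples) / (cell volume).
    return [(lo, hi, n / (n_total * volume))]

rng = np.random.default_rng(0)
samples = rng.beta(2.0, 5.0, size=(2000, 2))  # synthetic data in (0, 1)^2
leaves = fit_partition(samples, np.zeros(2), np.ones(2), len(samples))
# The estimate integrates to 1 up to floating-point error.
total_mass = sum(dens * np.prod(hi - lo) for lo, hi, dens in leaves)
```

Each leaf stores its own constant density, so evaluating the estimate at a point reduces to finding the leaf that contains it.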

Offline and Online Density Estimation for Large High-Dimensional Data
This work develops computationally efficient algorithms for high-dimensional density estimation based on Bayesian sequential partitioning (BSP). It proposes a progressive update of the binary partitions in BBSP, which improves both accuracy and speed for various block sizes.


Multivariate Density Estimation by Bayesian Sequential Partitioning
The Bayesian sequential partitioning (BSP) method proposed here is capable of providing much more accurate estimates when the sample space is of moderate to high dimension and can be used to design new classification methods competitive with the state of the art.
Entropy, Randomization, Derandomization, and Discrepancy
The star discrepancy is a measure of how uniformly distributed a finite point set is in the d-dimensional unit cube. It is related to high-dimensional numerical integration of certain function
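For intuition, the star discrepancy of points $x_1, \dots, x_N$ in $[0, 1]^d$ is the largest deviation, over boxes $[0, t)$ anchored at the origin, between the fraction of points in the box and the volume of the box. In one dimension it has a well-known closed form in terms of the sorted points, which the sketch below uses; the function name `star_discrepancy_1d` is hypothetical, and the formula does not extend to $d > 1$.

```python
import numpy as np

def star_discrepancy_1d(x):
    """Star discrepancy of points in [0, 1] via the sorted-sample formula
    D* = 1/(2N) + max_i |x_(i) - (2i - 1)/(2N)|  (valid for d = 1 only)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    centers = (2 * np.arange(1, n + 1) - 1) / (2 * n)
    return 1 / (2 * n) + np.max(np.abs(x - centers))

# Equally spaced midpoints attain the minimum possible value 1/(2N).
d_min = star_discrepancy_1d([0.125, 0.375, 0.625, 0.875])
# A single point at the origin is as non-uniform as possible.
d_max = star_discrepancy_1d([0.0])
```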
Low Discrepancy Constructions in the Triangle
Two quasi-Monte Carlo constructions in the triangle with a vanishing discrepancy are presented, including a version of the van der Corput sequence customized to the unit triangle that attains a discrepancy below $12/\sqrt{N}$.
The inverse of the star-discrepancy depends linearly on the dimension
We study bounds on the classical ∗-discrepancy and on its inverse. Let $n_\infty(d, \varepsilon)$ be the inverse of the ∗-discrepancy, i.e., the minimal number of points in dimension $d$ with the ∗-discrepancy at most
Computing Bounds for the Star Discrepancy
An algorithm to compute upper bounds for the star discrepancy of an arbitrary set of n points in the s-dimensional unit cube is proposed and improved upper bounds of some Faure (0,m,s)-nets are given.
Testing multivariate uniformity and its applications
By Monte Carlo simulation, it is found that the finite-sample distributions of the new statistics are well approximated by the standard normal distribution, $N(0,1)$, or the chi-squared distribution, $\chi^2(2)$.
Mean Shift: A Robust Approach Toward Feature Space Analysis
The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, which establishes its utility in detecting the modes of the density.
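A minimal sketch of such a recursive procedure, assuming a Gaussian kernel: the current location is repeatedly replaced by the kernel-weighted mean of the samples until it stops moving. The function name `mean_shift_mode` and the parameters `bandwidth`, `tol` and `max_iter` are illustrative choices, not taken from the paper.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=0.5, tol=1e-6, max_iter=500):
    """Iterate the mean shift update from `start` until the step is below `tol`."""
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current location.
        sq_dist = np.sum((points - x) ** 2, axis=1)
        weights = np.exp(-sq_dist / (2 * bandwidth ** 2))
        new_x = weights @ points / weights.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x

rng = np.random.default_rng(1)
cluster = rng.normal(loc=1.0, scale=0.1, size=(200, 2))  # synthetic data
mode = mean_shift_mode(cluster, start=[0.5, 0.5])
```

With a single tight cluster, the iteration should settle near the cluster centre; with several clusters, different starting points may converge to different modes.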
A New Randomized Algorithm to Approximate the Star Discrepancy Based on Threshold Accepting
We present a new algorithm for estimating the star discrepancy of arbitrary point sets. Similar to the algorithm for discrepancy approximation of Winker and Fang [SIAM J. Numer. Anal., 34 (1997), pp.
Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning
A unified, efficient model of random decision forests is presented that can be applied to a number of machine learning, computer vision and medical image analysis tasks. The work also discusses how alternatives such as random ferns and extremely randomized trees stem from the more general model.
Calculation of Discrepancy Measures and Applications
In this book chapter we survey known approaches and algorithms to compute discrepancy measures of point sets. After providing an introduction which puts the calculation of discrepancy measures in a