Corpus ID: 25736089

Learning Powers of Poisson Binomial Distributions

Dimitris Fotakis, Vasilis Kontonis, Piotr Krysta, Paul G. Spirakis
We introduce the problem of simultaneously learning all powers of a Poisson Binomial Distribution (PBD). A PBD of order $n$ is the distribution of a sum of $n$ mutually independent Bernoulli random variables $X_i$, where $\mathbb{E}[X_i] = p_i$. The $k$'th power of this distribution, for $k$ in a range $[m]$, is the distribution of $P_k = \sum_{i=1}^n X_i^{(k)}$, where each Bernoulli random variable $X_i^{(k)}$ has $\mathbb{E}[X_i^{(k)}] = (p_i)^k$. The learning algorithm can query any power…
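The definition of a PBD power above can be made concrete with a small sketch. The code below is only an illustration of the definition, not the paper's learning algorithm: it computes the exact probability mass function of a PBD by dynamic programming and obtains the $k$-th power by replacing each $p_i$ with $(p_i)^k$. The function names `pbd_pmf`, `pbd_power_pmf`, and `total_variation` are hypothetical and do not come from the paper.

```python
from typing import List


def pbd_pmf(probs: List[float]) -> List[float]:
    """Exact pmf of a sum of independent Bernoulli(p_i) variables.

    Entry s of the returned list is the probability that the sum equals s.
    The pmf is built one variable at a time (a simple convolution).
    """
    pmf = [1.0]  # sum of zero variables is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1.0 - p)  # this variable is 0
            nxt[s + 1] += mass * p      # this variable is 1
        pmf = nxt
    return pmf


def pbd_power_pmf(probs: List[float], k: int) -> List[float]:
    """pmf of the k-th power P_k: each X_i^(k) has expectation (p_i)^k."""
    return pbd_pmf([p ** k for p in probs])


def total_variation(a: List[float], b: List[float]) -> float:
    """Total variation distance between two pmfs on the same support."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    p = [0.5, 0.5]                 # assumed example parameters
    print(pbd_pmf(p))              # first power
    print(pbd_power_pmf(p, 2))     # second power, each p_i becomes 0.25
```

The first power coincides with the original PBD, and higher powers shift mass toward zero because $(p_i)^k \le p_i$ for $p_i \in [0, 1]$.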


Properly Learning Poisson Binomial Distributions in Almost Polynomial Time
This work gives an algorithm for properly learning Poisson binomial distributions and provides a novel structural characterization of PBDs, which allows the corresponding fitting problem to be reduced to a collection of systems of low-degree polynomial inequalities.
Optimal Learning via the Fourier Transform for Sums of Independent Integer Random Variables
A computationally efficient algorithm is designed that uses $\widetilde{O}(k/\epsilon^2)$ samples and learns an arbitrary $k$-SIIRV within error $\epsilon$ in total variation distance. The work also proves a tight lower bound on the size of $\epsilon$-covers for ${\cal S}_{n,k}$, which is the key ingredient in the authors' tight sample complexity lower bound.
Learning Poisson Binomial Distributions
This work considers a basic problem in unsupervised learning: learning an unknown Poisson binomial distribution. It gives a highly efficient algorithm that learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^{3})$ samples, independent of $n$.
Testing Poisson Binomial Distributions
The sample complexity of this algorithm improves quadratically upon that of the naive "learn followed by tolerant-test" approach, while instance optimal identity testing [VV14] is not applicable since it is looking to simultaneously test against a whole family of distributions.
Learning Coverage Functions and Private Release of Marginals
This work gives the first fully-polynomial algorithm for learning an interesting class of functions in the demanding PMAC model, based on several new structural properties of coverage functions, and shows that coverage functions are learnable agnostically with excess $\ell_1$-error $\epsilon$ over all product and symmetric distributions in time $n^{\log(1/\epsilon)}$.
Sparse covers for sums of indicators
It is shown that the set of Poisson binomial distributions on $n$ variables admits a proper $\epsilon$-cover in total variation distance of size $n^2 + n \cdot (1/\epsilon)^{O(\log^2(1/\epsilon))}$, which can be computed in polynomial time.
The Fourier transform of Poisson multinomial distributions and its algorithmic applications
A refined structural understanding of PMDs is obtained by analyzing their Fourier transform and a multivariate central limit theorem (CLT) is proved that relates an arbitrary PMD to a discretized multivariate Gaussian with the same mean and covariance, in total variation distance.
Learning Sums of Independent Integer Random Variables
The heart of the learning result is a new limit theorem which characterizes what the sum of an arbitrary number of arbitrary independent {0, 1, ... , k-1}-valued random variables may look like.
On the Structure, Covering, and Learning of Poisson Multinomial Distributions
This work generalizes the single-dimensional result of Daskalakis, Diakonikolas and Servedio for Poisson binomials to arbitrary dimension and shows that $(n, k)$-PMDs can be learned to within $\epsilon$ in total variation distance from $\tilde{O}_k(1/\epsilon^2)$ samples, which is near-optimal in terms of dependence on $\epsilon$ and independent of $n$.
A size-free CLT for Poisson multinomials and its applications
This work shows that any (n,k)-PMD is poly(k/σ)-close in total variation distance to the (appropriately discretized) multi-dimensional Gaussian with the same first two moments, removing the dependence on n from the Central Limit Theorem of Valiant and Valiant.