Fourier-Based Testing for Families of Distributions
@article{Canonne2017FourierBasedTF, title={Fourier-Based Testing for Families of Distributions}, author={Cl{\'e}ment L. Canonne and Ilias Diakonikolas and Alistair Stewart}, journal={Electron. Colloquium Comput. Complex.}, year={2017}, volume={TR17} }
We study the general problem of testing whether an unknown distribution belongs to a specified family of distributions. More specifically, given a distribution family $\mathcal{P}$ and sample access to an unknown discrete distribution $\mathbf{P}$, we want to distinguish (with high probability) between the case that $\mathbf{P} \in \mathcal{P}$ and the case that $\mathbf{P}$ is $\epsilon$-far, in total variation distance, from every distribution in $\mathcal{P}$. This is the prototypical…
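For concreteness, the testing task described in the abstract can be sketched in standard notation. The success probability $2/3$ (standing in for "with high probability"), the domain $\Omega$, and the symbol $d_{\mathrm{TV}}$ are conventional assumptions, not taken from the abstract:

```latex
% Total variation distance between distributions, and from a distribution to a family
d_{\mathrm{TV}}(\mathbf{P}, \mathbf{Q}) = \frac{1}{2} \sum_{x \in \Omega} \left| \mathbf{P}(x) - \mathbf{Q}(x) \right|,
\qquad
d_{\mathrm{TV}}(\mathbf{P}, \mathcal{P}) = \inf_{\mathbf{Q} \in \mathcal{P}} d_{\mathrm{TV}}(\mathbf{P}, \mathbf{Q})

% Completeness and soundness of a tester (2/3 is an assumed threshold)
\mathbf{P} \in \mathcal{P} \implies \Pr[\text{accept}] \geq 2/3,
\qquad
d_{\mathrm{TV}}(\mathbf{P}, \mathcal{P}) > \epsilon \implies \Pr[\text{reject}] \geq 2/3
```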
8 Citations
Testing for Families of Distributions via the Fourier Transform
- Mathematics
- NeurIPS
- 2018
This work applies its Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables, Poisson Multinomial Distributions, and Discrete Log-Concave Distributions.
Sharp Bounds for Generalized Uniformity Testing
- Computer Science, Mathematics
- Electron. Colloquium Comput. Complex.
- 2017
This work presents a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound on the sample complexity of generalized uniformity testing.
Testing Conditional Independence of Discrete Distributions
- Computer Science, Mathematics
- 2018 Information Theory and Applications Workshop (ITA)
- 2018
This work studies the problem of testing conditional independence for discrete distributions and develops a general theory providing tight variance bounds for specific estimators of this form, up to constant factors, for all such estimators.
Sample-Optimal Identity Testing with High Probability
- Mathematics, Computer Science
- Electron. Colloquium Comput. Complex.
- 2017
The new upper and lower bounds show that the optimal sample complexity of identity testing is $\Theta\left(\frac{1}{\epsilon^2}\left(\sqrt{n \log(1/\delta)} + \log(1/\delta)\right)\right)$ for any $n$, $\epsilon$, and $\delta$.
Testing Identity of Multidimensional Histograms
- Computer Science, Mathematics
- COLT
- 2019
An algorithm is given for identity testing of multidimensional histogram distributions, with sample complexity O(k/epsilon), that runs in sample-polynomial time and is robust to model misspecification, i.e., succeeds even if q is only promised to be close to a $k$-histogram.
Private Testing of Distributions via Sample Permutations
- Mathematics, Computer Science
- NeurIPS
- 2019
The framework of property testing is used to design algorithms to test the properties of the distribution that the data is drawn from with respect to differential privacy, which indicates that differential privacy can be obtained in most regimes of parameters for free.
Modern challenges in distribution testing
- Computer Science
- 2018
The goal of this dissertation is to identify and address several contemporary challenges in distribution testing and make progress in answering the following questions.
Property Testing and Probability Distributions: New Techniques, New Models, and New Goals
- Economics
- 2017
Clément L. Canonne. Recently there has been a lot of glorious hullabaloo about Big Data and how it is going to…
References
Optimal Testing for Properties of Distributions
- Mathematics, Computer Science
- NIPS
- 2015
This work provides a general approach via which sample-optimal and computationally efficient testers for discrete log-concave and monotone hazard rate distributions are obtained.
Testing Shape Restrictions of Discrete Distributions
- Mathematics, Computer Science
- Theory of Computing Systems
- 2017
A general algorithm is developed that applies to a large range of “shape-constrained” properties, including monotone, log-concave, t-modal, piecewise-polynomial, and Poisson Binomial distributions, and is computationally efficient.
Properly Learning Poisson Binomial Distributions in Almost Polynomial Time
- Computer Science, Mathematics
- COLT
- 2016
This work gives an algorithm for properly learning Poisson binomial distributions (PBDs) and provides a novel structural characterization of PBDs, which allows the corresponding fitting problem to be reduced to a collection of systems of low-degree polynomial inequalities.
Testing Identity of Structured Distributions
- Mathematics, Computer Science
- SODA
- 2015
A unified approach is presented that yields new, simple testers, with sample complexity that is information-theoretically optimal, for broad classes of structured distributions, including $t$-flat distributions, $t$-modal distributions, log-concave distributions, monotone hazard rate (MHR) distributions, and mixtures thereof.
Efficient Robust Proper Learning of Log-concave Distributions
- Computer Science
- ArXiv
- 2016
This work gives the first computationally efficient algorithm for the robust proper learning of univariate log-concave distributions, which achieves the information-theoretically optimal sample size, runs in polynomial time, and is robust to model misspecification with nearly-optimal error guarantees.
Optimal Learning via the Fourier Transform for Sums of Independent Integer Random Variables
- Computer Science, Mathematics
- COLT
- 2016
A computationally efficient algorithm is designed that uses $\widetilde{O}(k/\epsilon^2)$ samples and learns an arbitrary $k$-SIIRV within error $\epsilon$ in total variation distance. The work also proves a tight lower bound on the size of $\epsilon$-covers for ${\cal S}_{n,k}$, which is the key ingredient in the authors' tight sample complexity lower bound.
A New Approach for Testing Properties of Discrete Distributions
- Computer Science, Mathematics
- 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
- 2016
The sample complexity of the algorithm depends on the structure of the unknown distributions - as opposed to merely their domain size - and is significantly better compared to the worst-case optimal L1-tester in many natural instances.
Optimal Algorithms for Testing Closeness of Discrete Distributions
- Computer Science, Mathematics
- SODA
- 2014
This work presents simple testers for both the $\ell_1$ and $\ell_2$ settings, with sample complexity that is information-theoretically optimal, up to constant factors, and establishes that the sample complexity is $\Theta(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$.
Global rates of convergence in log-concave density estimation
- Computer Science, Mathematics
- 2014
It is proved that when $d \leq 3$ the log-concave maximum likelihood estimator achieves the minimax optimal rate (up to logarithmic factors when $d = 2,3$) with respect to squared Hellinger loss.
Probability Inequalities for Sums of Bounded Random Variables
- Mathematics
- 1994
If S is a random variable with finite mean and variance, the Bienaymé-Chebyshev inequality states that for x > 0,
$$\Pr \left[ {\left| {S - ES} \right| \geqslant x{{{(\operatorname{var}…