• Corpus ID: 7964701

# Fourier-Based Testing for Families of Distributions

@article{Canonne2017FourierBasedTF,
title={Fourier-Based Testing for Families of Distributions},
author={Cl{\'e}ment L. Canonne and Ilias Diakonikolas and Alistair Stewart},
journal={Electron. Colloquium Comput. Complex.},
year={2017},
volume={TR17}
}
• Published 1 June 2017
• Mathematics, Computer Science
• Electron. Colloquium Comput. Complex.
We study the general problem of testing whether an unknown distribution belongs to a specified family of distributions. More specifically, given a distribution family $\mathcal{P}$ and sample access to an unknown discrete distribution $\mathbf{P}$, we want to distinguish (with high probability) between the case that $\mathbf{P} \in \mathcal{P}$ and the case that $\mathbf{P}$ is $\epsilon$-far, in total variation distance, from every distribution in $\mathcal{P}$. This is the prototypical…
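As a concrete illustration of the distance notion used in the abstract, total variation distance between discrete distributions, and the resulting notion of being $\epsilon$-far from a family, can be sketched as follows (a minimal sketch of the standard definitions, not code from the paper; the function names are ours):

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete distributions,
    given as probability vectors: d_TV(p, q) = (1/2) * sum_i |p_i - q_i|."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def is_eps_far(p, family, eps):
    """P is eps-far from the family if *every* member is at TV distance >= eps."""
    return all(total_variation(p, q) >= eps for q in family)

uniform = [0.25] * 4
point_mass = [1.0, 0.0, 0.0, 0.0]
print(total_variation(uniform, point_mass))  # 0.75
```

A tester must then distinguish membership in the family from being $\epsilon$-far from all of its members simultaneously, which is what makes testing against a whole family harder than testing identity to a single known distribution.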
8 Citations
• Mathematics
NeurIPS
• 2018
This work applies its Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables, Poisson Multinomial Distributions, and Discrete Log-Concave Distributions.
• Computer Science, Mathematics
Electron. Colloquium Comput. Complex.
• 2017
This work presents a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound on the sample complexity of generalized uniformity testing.
• Computer Science, Mathematics
2018 Information Theory and Applications Workshop (ITA)
• 2018
This work studies the problem of testing conditional independence for discrete distributions and develops a general theory providing variance bounds, tight up to constant factors, for all estimators of a certain natural form.
• Mathematics, Computer Science
Electron. Colloquium Comput. Complex.
• 2017
The new upper and lower bounds show that the optimal sample complexity of identity testing is $\Theta\left( \frac{1}{\epsilon^2}\left(\sqrt{n \log(1/\delta)} + \log(1/\delta) \right)\right)$ for any $n$, $\epsilon$, and $\delta$.
• Computer Science, Mathematics
COLT
• 2019
An algorithm for identity testing of multidimensional histogram distributions with sample complexity $O(k/\epsilon)$ that runs in sample-polynomial time and is robust to model misspecification, i.e., succeeds even if $q$ is only promised to be *close* to a $k$-histogram.
• Mathematics, Computer Science
NeurIPS
• 2019
The framework of property testing is used to design differentially private algorithms for testing properties of the distribution the data is drawn from, indicating that differential privacy can be obtained for free in most regimes of parameters.
The goal of this dissertation is to identify and address several contemporary challenges in distribution testing and make progress in answering the following questions.
Property Testing and Probability Distributions: New Techniques, New Models, and New Goals
Clément L. Canonne
Recently there has been a lot of glorious hullabaloo about Big Data and how it is going to…

## References

SHOWING 1-10 OF 61 REFERENCES

• Mathematics, Computer Science
NIPS
• 2015
This work provides a general approach via which sample-optimal and computationally efficient testers for discrete log-concave and monotone hazard rate distributions are obtained.
• Mathematics, Computer Science
Theory of Computing Systems
• 2017
A general algorithm is developed that applies to a large range of "shape-constrained" properties, including monotone, log-concave, $t$-modal, piecewise-polynomial, and Poisson Binomial distributions, and is computationally efficient.
• Computer Science, Mathematics
COLT
• 2016
An algorithm for properly learning Poisson binomial distributions is given, along with a novel structural characterization of PBDs, which allows the corresponding fitting problem to be reduced to a collection of systems of low-degree polynomial inequalities.
• Mathematics, Computer Science
SODA
• 2015
A unified approach is presented that yields new, simple testers, with sample complexity that is information-theoretically optimal, for broad classes of structured distributions, including $t$-flat distributions, $t$-modal distributions, log-concave distributions, monotone hazard rate (MHR) distributions, and mixtures thereof.
• Computer Science
ArXiv
• 2016
This work gives the first computationally efficient algorithm for the robust proper learning of univariate log-concave distributions, which achieves the information-theoretically optimal sample size, runs in polynomial time, and is robust to model misspecification with nearly-optimal error guarantees.
• Computer Science, Mathematics
COLT
• 2016
A computationally efficient algorithm is designed that uses $\widetilde{O}(k/\epsilon^2)$ samples and learns an arbitrary $k$-SIIRV within error $\epsilon$ in total variation distance, and a tight lower bound is proved on the size of $\epsilon$-covers for ${\cal S}_{n,k}$, which is the key ingredient in the authors' tight sample complexity lower bound.
• Computer Science, Mathematics
2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
• 2016
The sample complexity of the algorithm depends on the structure of the unknown distributions - as opposed to merely their domain size - and is significantly better than that of the worst-case optimal $L_1$-tester in many natural instances.
• Computer Science, Mathematics
SODA
• 2014
This work presents simple testers for both the $\ell_1$ and $\ell_2$ settings, with sample complexity that is information-theoretically optimal up to constant factors, and establishes that the sample complexity is $\Theta\left(\max\left\{ n^{2/3}/\epsilon^{4/3},\; n^{1/2}/\epsilon^{2} \right\}\right)$.
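The flavor of statistic behind such $\ell_2$-based testers can be illustrated with a chi-square-type estimator of the squared $\ell_2$ distance to a known reference distribution. This is a generic sketch of the technique, not the paper's exact tester; the Poissonized sampling makes the estimator unbiased, and all names here are ours:

```python
import numpy as np

def l2_identity_statistic(samples, q, m):
    """Estimate ||p - q||_2^2 from a Poissonized sample of expected size m.

    With independent N_i ~ Poisson(m * p_i), each term
    (N_i - m*q_i)^2 - N_i has expectation m^2 * (p_i - q_i)^2,
    so Z / m^2 is an unbiased estimate of the squared ell_2 distance.
    """
    q = np.asarray(q, dtype=float)
    counts = np.bincount(samples, minlength=len(q)).astype(float)
    z = np.sum((counts - m * q) ** 2 - counts)
    return z / m**2

rng = np.random.default_rng(0)
n, m = 10, 200_000
q = np.full(n, 1.0 / n)                 # reference: uniform on [n]
p = np.array([0.2] + [0.8 / 9] * 9)     # a perturbed distribution
k = rng.poisson(m)                      # Poissonized sample size
samples = rng.choice(n, size=k, p=p)
est = l2_identity_statistic(samples, q, m)
print(est)  # close to ||p - q||_2^2 ≈ 0.0111
```

A tester then thresholds this estimate; turning an $\ell_2$ guarantee into an $\ell_1$ (total variation) guarantee is where the domain-size dependence in the $\Theta(\max\{n^{2/3}/\epsilon^{4/3}, n^{1/2}/\epsilon^2\})$ bound comes from.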
• Computer Science, Mathematics
• 2014
It is proved that when $d \leq 3$ the log-concave maximum likelihood estimator achieves the minimax optimal rate (up to logarithmic factors when $d = 2, 3$) with respect to squared Hellinger loss.
• Mathematics
• 1994
If $S$ is a random variable with finite mean and variance, the Bienaymé-Chebyshev inequality states that for $x > 0$, $\Pr\left[ \left| S - ES \right| \geqslant x \,(\operatorname{var} S)^{1/2} \right] \leq x^{-2}$.
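The Bienaymé-Chebyshev inequality above is easy to sanity-check numerically. A small sketch of our own, using an arbitrary distribution with finite mean and variance:

```python
import numpy as np

# Empirically check Pr[|S - ES| >= x * sd(S)] <= 1/x^2
# for an exponential random variable (mean 2, standard deviation 2).
rng = np.random.default_rng(1)
S = rng.exponential(scale=2.0, size=500_000)
mu, sigma = S.mean(), S.std()

for x in (1.5, 2.0, 3.0):
    tail = np.mean(np.abs(S - mu) >= x * sigma)
    bound = 1.0 / x**2
    print(f"x={x}: empirical tail {tail:.4f} <= Chebyshev bound {bound:.4f}")
    assert tail <= bound
```

For the exponential the true tail probabilities are far below the Chebyshev bound; the bound's value is that it holds for *every* distribution with finite variance, which is exactly why it is a workhorse in sample complexity arguments like those in this paper.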