# A Differentially Private Kernel Two-Sample Test

@inproceedings{Raj2019ADP, title={A Differentially Private Kernel Two-Sample Test}, author={Anant Raj and Ho Chung Leon Law and D. Sejdinovic and Mijung Park}, booktitle={ECML/PKDD}, year={2019} }

Kernel two-sample testing is a useful statistical tool for determining whether data samples arise from different distributions, without imposing any parametric assumptions on those distributions. However, raw data samples can expose sensitive information about individuals who participate in scientific studies, which makes current tests vulnerable to privacy breaches. Hence, we design a new framework for kernel two-sample testing conforming to differential privacy constraints, in order to …
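The abstract only gestures at the construction, but the general shape of a private test statistic can be sketched: compute a bounded-kernel MMD estimate, then release it through the Gaussian mechanism. Everything below (the RBF kernel, the biased V-statistic, and especially the sensitivity constant) is an illustrative assumption, not the paper's actual mechanism or its tight sensitivity analysis.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2), bounded in (0, 1].
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2_biased(X, Y, gamma=1.0):
    # Biased (V-statistic) estimate of squared MMD between samples X and Y.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

def private_mmd2(X, Y, gamma=1.0, epsilon=1.0, delta=1e-5, rng=None):
    # Gaussian-mechanism release of the statistic. For a kernel bounded by 1,
    # replacing one sample changes the biased MMD^2 estimate by O(1/m); the
    # constant below is a loose placeholder, not the paper's sensitivity bound.
    rng = np.random.default_rng(rng)
    m = min(len(X), len(Y))
    sensitivity = 4.0 / m                       # illustrative assumption
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return mmd2_biased(X, Y, gamma) + rng.normal(0.0, sigma)
```

The noisy statistic would then be compared against a null threshold that accounts for both sampling variability and the added noise, which is where the paper's actual analysis does the work.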

#### 2 Citations

Application of Kernel Hypothesis Testing on Set-valued Data

- 2021

We present a general framework for kernel hypothesis testing on distributions of sets of individual examples. Sets may represent many common data sources such as groups of observations in time …

MONK - Outlier-Robust Mean Embedding Estimation by Median-of-Means

- Computer Science, Mathematics
- ICML
- 2019

This paper shows how the recently emerged principle of median-of-means can be used to design estimators for kernel mean embedding and MMD with excessive resistance properties to outliers, and optimal sub-Gaussian deviation bounds under mild assumptions.
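The median-of-means principle behind MONK is easy to illustrate on a plain scalar mean (the paper applies it to kernel mean embeddings and MMD); the block count here is a hypothetical tuning choice.

```python
import numpy as np

def median_of_means(x, num_blocks=8, rng=None):
    # Randomly partition the sample into blocks, average each block, and
    # return the median of the block means. A minority of wild outliers can
    # corrupt only a minority of blocks, so the median stays near the truth.
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    idx = rng.permutation(len(x))
    blocks = np.array_split(idx, num_blocks)
    return float(np.median([x[b].mean() for b in blocks]))
```

With 99 zeros and one huge outlier, the ordinary mean is dragged to around 10,000 while the median-of-means stays near 0, since the outlier lands in a single block.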

#### References

Showing 1–10 of 35 references.

Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing

- Psychology, Mathematics
- ICML
- 2016

Hypothesis testing is a useful statistical tool in determining whether a given model should be rejected based on a sample from the population. Sample data may contain sensitive information about …

A Kernel Two-Sample Test

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2012

This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
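A minimal sketch of an MMD-based two-sample test, using a permutation approximation of the null distribution rather than the paper's large-deviation bounds; the kernel, bandwidth, and permutation count are assumptions.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    # RBF kernel matrix between the rows of X and Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    # Biased estimate of squared maximum mean discrepancy.
    return rbf(X, X, gamma).mean() + rbf(Y, Y, gamma).mean() - 2 * rbf(X, Y, gamma).mean()

def mmd_permutation_test(X, Y, gamma=1.0, num_perms=200, rng=None):
    # Under H0 the pooled sample is exchangeable, so re-splitting it at random
    # simulates draws from the null distribution of the statistic.
    rng = np.random.default_rng(rng)
    Z = np.vstack([X, Y])
    m = len(X)
    observed = mmd2(X, Y, gamma)
    null = []
    for _ in range(num_perms):
        idx = rng.permutation(len(Z))
        null.append(mmd2(Z[idx[:m]], Z[idx[m:]], gamma))
    # Add-one correction keeps the p-value strictly positive.
    return (1 + sum(s >= observed for s in null)) / (1 + num_perms)
```

For well-separated distributions the observed statistic exceeds essentially every permuted value, giving a p-value near 1/(num_perms + 1).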

A Fast, Consistent Kernel Two-Sample Test

- Computer Science, Mathematics
- NIPS
- 2009

A novel estimate of the null distribution is computed from the eigen-spectrum of the Gram matrix on the aggregate sample from P and Q, with lower computational cost than the bootstrap.
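The eigen-spectrum construction can be sketched as follows: take the centered Gram matrix on the pooled sample, rescale its eigenvalues, and simulate the weighted sum of chi-squared variables that forms the asymptotic null. The scaling constants below follow the asymptotic form Σ_l λ_l (z_l² − 2) with z_l ~ N(0, 2) and are reproduced from memory, so treat them as an assumption to check against the paper.

```python
import numpy as np

def null_samples_from_spectrum(Z, gamma=1.0, num_draws=2000, rng=None):
    # Draw approximate samples from the null distribution of m * MMD^2 using
    # the spectrum of the centered Gram matrix on the pooled sample Z.
    rng = np.random.default_rng(rng)
    n = len(Z)
    d2 = np.sum(Z**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * Z @ Z.T
    K = np.exp(-gamma * d2)
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    lam = np.linalg.eigvalsh(H @ K @ H) / n   # empirical operator eigenvalues
    lam = lam[lam > 1e-12]                    # drop numerical-noise eigenvalues
    z = rng.normal(0.0, np.sqrt(2.0), size=(num_draws, len(lam)))
    return (z**2 - 2.0) @ lam                 # weighted chi-squared draws
```

A test threshold is then the empirical (1 − α) quantile of these draws, replacing the much more expensive bootstrap over resampled datasets.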

Optimal kernel choice for large-scale two-sample tests

- Computer Science, Mathematics
- NIPS
- 2012

The new kernel selection approach yields a more powerful test than earlier kernel selection heuristics, and makes the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory.

Fast Two-Sample Testing with Analytic Representations of Probability Measures

- Computer Science, Mathematics
- NIPS
- 2015

A class of nonparametric two-sample tests with cost linear in the sample size, based on an ensemble of distances between analytic functions representing each of the distributions; these give a better power/time tradeoff than competing approaches, and in some cases better outright power than even the most expensive quadratic-time tests.
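The linear-time idea can be sketched by comparing smoothed empirical mean embeddings of the two samples at a handful of test locations; the locations, bandwidth, and covariance regularizer below are illustrative assumptions, not the paper's recommended choices.

```python
import numpy as np

def me_test_statistic(X, Y, locations, gamma=1.0):
    # Evaluate an RBF feature at J fixed test locations for each sample point,
    # then form a Hotelling-style statistic on the per-sample differences.
    # Under H0 the statistic is asymptotically chi-squared with J degrees of
    # freedom, so it can be compared to a chi-squared quantile.
    def feat(A):
        d2 = (np.sum(A**2, 1)[:, None] + np.sum(locations**2, 1)[None, :]
              - 2 * A @ locations.T)
        return np.exp(-gamma * d2)            # (n, J) smoothed features
    n = min(len(X), len(Y))
    D = feat(X[:n]) - feat(Y[:n])             # per-sample embedding differences
    zbar = D.mean(0)
    S = np.cov(D.T) + 1e-8 * np.eye(len(locations))  # regularized covariance
    return float(n * zbar @ np.linalg.solve(S, zbar))
```

Each sample is touched once per location, so the cost is O(nJ) rather than the O(n²) of a full Gram matrix.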

The Limits of Two-Party Differential Privacy

- Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

These bounds expose a dramatic gap between the accuracy that can be obtained by differentially private data analysis and the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy.

Local Private Hypothesis Testing: Chi-Square Tests

- Mathematics, Computer Science
- ICML
- 2018

This work analyzes locally private chi-square tests for goodness of fit and independence testing, which have been studied in the traditional, curator model for differential privacy, to explore the design of private hypothesis tests in the local model.
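In the local model, each individual privatizes their own categorical response before any chi-square statistic is formed. The k-ary randomized response below is a generic local-DP primitive for this setting, not the specific mechanism analyzed in the paper.

```python
import numpy as np

def randomized_response(category, num_categories, epsilon, rng=None):
    # k-ary randomized response: report the true category with probability
    # e^eps / (e^eps + k - 1), otherwise a uniformly random other category.
    # Each report is epsilon-locally differentially private on its own.
    rng = np.random.default_rng(rng)
    p_true = np.exp(epsilon) / (np.exp(epsilon) + num_categories - 1)
    if rng.random() < p_true:
        return category
    others = [c for c in range(num_categories) if c != category]
    return int(rng.choice(others))
```

The curator then debiases the noisy counts (each observed count is a known linear mixture of the true ones) before running the chi-square test, which is where the locally private analysis departs from the classical one.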

Differentially Private Learning with Kernels

- Computer Science
- ICML
- 2013

This paper derives differentially private learning algorithms with provable "utility" or error bounds for the standard model of releasing a differentially private predictor, under three simple but practical settings.

Differentially Private Empirical Risk Minimization

- Medicine, Computer Science
- J. Mach. Learn. Res.
- 2011

This work proposes a new method, objective perturbation, for privacy-preserving machine learning algorithm design, and shows both theoretically and empirically that this method is superior to the previous state of the art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.

Differential privacy for functions and functional data

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2013

This work shows that adding an appropriately scaled Gaussian process to the function of interest yields differential privacy, and develops corresponding methods for releasing functions privately.
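The Gaussian-process mechanism can be sketched on a finite evaluation grid: sample noise with the kernel's covariance and add it to the function's values. The scaling constant and the `rkhs_sensitivity` argument are assumptions standing in for the paper's analysis of how much the function can change in RKHS norm between neighboring datasets.

```python
import numpy as np

def release_private_function(f_values, grid, rkhs_sensitivity, epsilon, delta,
                             gamma=1.0, rng=None):
    # Sketch of a GP-noise release: draw a Gaussian process with the RBF
    # kernel as covariance on the grid, scale it by an assumed
    # Gaussian-mechanism constant, and add it to the function values.
    rng = np.random.default_rng(rng)
    d2 = (grid[:, None] - grid[None, :]) ** 2
    K = np.exp(-gamma * d2) + 1e-8 * np.eye(len(grid))   # jittered RBF covariance
    scale = rkhs_sensitivity * np.sqrt(2 * np.log(2 / delta)) / epsilon
    noise = scale * rng.multivariate_normal(np.zeros(len(grid)), K)
    return np.asarray(f_values) + noise
```

Because the noise shares the kernel's smoothness, the released curve remains a plausible function in the same RKHS rather than pointwise-independent static.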