Corpus ID: 237572337

The power of private likelihood-ratio tests for goodness-of-fit in frequency tables

Emanuele Dolera and Stefano Favaro
Privacy-protecting data analysis investigates statistical methods under privacy constraints. This is a rising challenge in modern statistics, as the achievement of confidentiality guarantees, which typically occurs through suitable perturbations of the data, may determine a loss in the statistical utility of the data. In this paper, we consider privacy-protecting tests for goodness-of-fit in frequency tables, this being arguably the most common form of releasing data. Under the popular…


Revisiting Differentially Private Hypothesis Tests for Categorical Data
A modified equivalence between chi-squared tests and likelihood ratio tests is shown, more suited to hypothesis testing with privacy, and differentially private likelihood ratio and chi-squared tests for a variety of applications on tabular data are developed.
Differentially Private Testing of Identity and Closeness of Discrete Distributions
The fundamental problems of identity testing (goodness of fit), and closeness testing (two sample test) of distributions over $k$ elements, under differential privacy are studied, and Le Cam's two point theorem is used to provide a general mechanism for proving lower bounds.
The structure of optimal private tests for simple hypotheses
Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple…
Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing
Hypothesis testing is a useful statistical tool in determining whether a given model should be rejected based on a sample from the population. Sample data may contain sensitive information about…
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
Confidentiality and Differential Privacy in the Dissemination of Frequency Tables
For decades, national statistical agencies and other data custodians have been publishing frequency tables based on census, survey and administrative data. In order to protect the confidentiality of…
Differential Privacy and the Risk-Utility Tradeoff for Multi-dimensional Contingency Tables
This paper explores how well the mechanism works in the context of a series of examples, and the extent to which the proposed differential-privacy mechanism allows for sensible inferences from the released data.
Differentially Private Uniformly Most Powerful Tests for Binomial Data
This work derives uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance and obtaining exact p-values, which are easily computed in terms of the Tulap random variable.
Limiting privacy breaches in privacy preserving data mining
This paper presents a new formulation of privacy breaches, together with a methodology, "amplification", for limiting them. It instantiates this methodology for the problem of mining association rules, and modifies the algorithm from [9] to limit privacy breaches without knowledge of the data distribution.
The Algorithmic Foundations of Differential Privacy
The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example.