# A Simple and Practical Algorithm for Differentially Private Data Release

@inproceedings{Hardt2012ASA, title={A Simple and Practical Algorithm for Differentially Private Data Release}, author={Moritz Hardt and Katrina Ligett and Frank McSherry}, booktitle={NIPS}, year={2012} }

We present a new algorithm for differentially private data release, based on a simple combination of the Multiplicative Weights update rule with the Exponential Mechanism. Our MWEM algorithm achieves what are the best known and nearly optimal theoretical guarantees, while at the same time being simple to implement and experimentally more accurate on actual data sets than existing techniques.
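The combination described in the abstract can be sketched as a short loop: the Exponential Mechanism picks a query on which the current synthetic histogram is badly wrong, a Laplace-noised measurement of that query is taken, and a Multiplicative Weights step pulls the histogram toward the measurement. The sketch below is illustrative only and is not the authors' implementation. The function name `mwem`, the histogram-over-a-finite-universe representation, the 0/1 query vectors, the even split of the privacy budget across `T` rounds, and the choice to average the updated iterates are all assumptions made for this example.

```python
import math
import random


def mwem(hist, queries, T, eps, rng=None):
    """Illustrative MWEM-style sketch (hypothetical helper, not the paper's code).

    hist    -- true histogram: non-negative counts over a finite universe
    queries -- linear counting queries, each a 0/1 vector over the universe
    T       -- number of rounds
    eps     -- total privacy budget, split evenly across rounds (assumption)
    """
    rng = rng or random.Random()
    n = sum(hist)
    if n <= 0:
        raise ValueError("histogram must contain at least one record")
    size = len(hist)

    def answer(q, h):
        # Evaluate a linear query as a dot product with a histogram.
        return sum(qx * hx for qx, hx in zip(q, h))

    approx = [n / size] * size  # start from the uniform histogram, scaled to n
    average = [0.0] * size

    for _ in range(T):
        # Exponential Mechanism with budget eps/(2T): the score of a query is
        # its current absolute error (sensitivity 1 for counting queries).
        scores = [abs(answer(q, approx) - answer(q, hist)) for q in queries]
        top = max(scores)
        weights = [math.exp(eps * (s - top) / (4 * T)) for s in scores]
        chosen = rng.choices(range(len(queries)), weights=weights)[0]
        q = queries[chosen]

        # Laplace measurement with scale 2T/eps (budget eps/(2T)); the
        # difference of two i.i.d. exponentials is Laplace-distributed.
        scale = 2 * T / eps
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        measurement = answer(q, hist) + noise

        # Multiplicative Weights update, then renormalise to total mass n.
        error = measurement - answer(q, approx)
        approx = [a * math.exp(qx * error / (2 * n)) for a, qx in zip(approx, q)]
        total = sum(approx)
        approx = [a * n / total for a in approx]

        average = [s + a / T for s, a in zip(average, approx)]

    return average
```

A call such as `mwem([10, 0, 0, 30], [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]], T=5, eps=1.0)` returns a synthetic histogram with the same total mass as the input. Standard-library floating-point sampling is used here for readability and is not suitable where the privacy guarantee must hold against an adversary who can exploit floating-point artefacts.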

#### 361 Citations

Permute-and-Flip: A new mechanism for differentially private selection

- Computer Science
- NeurIPS
- 2020

This work proposes a new mechanism for differentially private selection based on a careful analysis of the privacy constraints, which can offer improvements up to a factor of two and runs in linear time.

A minimax distortion view of differentially private query release

- Mathematics, Computer Science
- 2015 49th Asilomar Conference on Signals, Systems and Computers
- 2015

It is proved that the minimax distortion is O(1/n) as the database size n goes to infinity, with the squared-error distortion measure and fixed dimension of data entries, for the general class of statistical queries.

Differentially Private Data Publishing and Analysis: A Survey

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2017

This survey compares the diverse release mechanisms of differentially private data publishing given a variety of input data in terms of query type, the maximum number of queries, efficiency, and accuracy.

Dual Query: Practical Private Query Release for High Dimensional Data

- Computer Science
- ICML
- 2014

The algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.

New Oracle-Efficient Algorithms for Private Synthetic Data Release

- Computer Science, Mathematics
- ICML
- 2020

Three new algorithms for constructing differentially private synthetic data are presented. The synthetic data is a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries, and the algorithms are computationally efficient when given access to an optimization oracle.

Hardness of Non-Interactive Differential Privacy from One-Way Functions

- Computer Science
- IACR Cryptol. ePrint Arch.
- 2017

This work considers the problem of designing a differentially private algorithm that takes a dataset, consisting of some small number of elements n from some large data universe X, and efficiently outputs a summary that allows a user to efficiently obtain an answer to any query in some large family Q.

Iterative Constructions and Private Data Release

- Computer Science
- TCC
- 2012

New algorithms (and new analyses of existing algorithms) in both the interactive and non-interactive settings are given, and a reduction based on the IDC framework shows that an efficient, private algorithm for computing sufficiently accurate rank-1 matrix approximations would lead to an improved efficient algorithm for releasing private synthetic data for graph cuts.

twinify: A software package for differentially private data release

- 2020

Differentially private (DP) data sharing has facilitated releasing data containing sensitive information in a privacy-preserving manner. Many of these data sharing approaches rely on generative…

Subsampled Exponential Mechanism: Differential Privacy in Large Output Spaces

- Mathematics, Computer Science
- AISec@CCS
- 2015

This work presents the subsampled exponential mechanism, which scores only a sample of the outcomes, and shows that it still preserves differential privacy and fulfills a similar accuracy bound.

A Data- and Workload-Aware Algorithm for Range Queries Under Differential Privacy

- Computer Science
- ArXiv
- 2014

A new algorithm is described for answering a given set of range queries under $\epsilon$-differential privacy. It often achieves substantially lower error than competing methods and can achieve the benefits of data-dependence on both "easy" and "hard" databases.

#### References

SHOWING 1-10 OF 31 REFERENCES

Efficient Batch Query Answering Under Differential Privacy

- Computer Science
- ArXiv
- 2011

This paper develops efficient algorithms for answering multiple queries under differential privacy with low error by advancing a recent approach called the matrix mechanism, which generalizes standard differentially private mechanisms.

On the complexity of differentially private data release: efficient algorithms and hardness results

- Computer Science
- STOC '09
- 2009

This work considers private data analysis in the setting in which a trusted and trustworthy curator releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst.

An Adaptive Mechanism for Accurate Query Answering under Differential Privacy

- Computer Science
- Proc. VLDB Endow.
- 2012

An adaptive mechanism is presented for answering sets of counting queries under differential privacy; it approximates the optimal strategy for any workload of linear counting queries and achieves near-optimal error for many workloads.

A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis

- Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

A new differentially private multiplicative weights mechanism is presented for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen. It is shown that when the input database is drawn from a smooth distribution (a distribution that does not place too much weight on any single data item), accuracy remains as above, and the running time becomes poly-logarithmic in the data universe size.

Privacy-Preserving Datamining on Vertically Partitioned Databases

- Computer Science
- CRYPTO
- 2004

Under a rigorous definition of breach of privacy, Dinur and Nissim proved that unless the total number of queries is sub-linear in the size of the database, a substantial amount of noise is required to avoid a breach, rendering the database almost useless.

The Median Mechanism: Interactive and Efficient Privacy with Multiple Queries

- Computer Science
- ArXiv
- 2009

The median mechanism is the first privacy mechanism capable of identifying and exploiting correlations among queries in an interactive setting, and an efficient implementation is given, with running time polynomial in the number of queries, the database size, and the domain size.

Boosting the accuracy of differentially private histograms through consistency

- Computer Science
- Proc. VLDB Endow.
- 2010

It is shown that it is possible to significantly improve the accuracy of a general class of histogram queries while satisfying differential privacy, and that these techniques can be used for estimating the degree sequence of a graph very precisely, and for computing a histogram that can support arbitrary range queries accurately.

PCPs and the Hardness of Generating Private Synthetic Data

- Mathematics, Computer Science
- TCC
- 2011

It is shown that there is no polynomial-time, differentially private algorithm A that takes a database $D$ and outputs a "synthetic database" $\hat{D}$ all of whose two-way marginals are approximately equal to those of $D$.

Mechanism Design via Differential Privacy

- Computer Science
- 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07)
- 2007

It is shown that the recent notion of differential privacy, in addition to its own intrinsic virtue, can ensure that participants have limited effect on the outcome of the mechanism, and as a consequence have limited incentive to lie.

Measuring the achievable error of query sets under differential privacy

- Computer Science
- ArXiv
- 2012

A novel lower bound on the minimum total error required to simultaneously release answers to a set of workload queries is presented, which reveals that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form.