Corpus ID: 947901

# A Simple and Practical Algorithm for Differentially Private Data Release

@inproceedings{Hardt2012ASA,
title={A Simple and Practical Algorithm for Differentially Private Data Release},
author={Moritz Hardt and Katrina Ligett and Frank McSherry},
booktitle={NIPS},
year={2012}
}
• Published in NIPS 2012
• Computer Science
We present a new algorithm for differentially private data release, based on a simple combination of the Multiplicative Weights update rule with the Exponential Mechanism. Our MWEM algorithm achieves the best known, nearly optimal theoretical guarantees while remaining simple to implement and, in experiments on real data sets, more accurate than existing techniques.
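The combination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' reference implementation: it assumes the data set is a histogram over a small finite universe, that each query is a list of per-element weights in [0, 1] (sensitivity 1), that the total count n is public, and that the privacy budget is split evenly across rounds. The names `mwem`, `answer`, `rounds`, and `eps_round` are hypothetical.

```python
import math
import random


def answer(query, hist):
    """Evaluate a linear query (one weight per universe element) on a histogram."""
    return sum(w * c for w, c in zip(query, hist))


def mwem(hist, queries, rounds, epsilon, rng=None):
    """Sketch of a Multiplicative Weights + Exponential Mechanism loop.

    Assumptions (not taken from the page above): n = sum(hist) is public,
    every query has sensitivity 1, and each round spends epsilon / rounds,
    half on selection and half on measurement.
    """
    rng = rng or random.Random()
    n = float(sum(hist))
    size = len(hist)
    eps_round = epsilon / (2.0 * rounds)  # budget for each of the two private steps
    true_answers = [answer(q, hist) for q in queries]
    synth = [n / size] * size             # start from the uniform histogram
    average = [0.0] * size

    for _ in range(rounds):
        # Exponential Mechanism: favour queries the synthetic data answers badly.
        scores = [abs(answer(q, synth) - t) for q, t in zip(queries, true_answers)]
        top = max(scores)                 # subtract the max for numerical stability
        weights = [math.exp(eps_round * (s - top) / 2.0) for s in scores]
        i = rng.choices(range(len(queries)), weights=weights)[0]

        # Laplace noise with scale 1 / eps_round, drawn as a difference of two
        # exponential variates with rate eps_round.
        noise = rng.expovariate(eps_round) - rng.expovariate(eps_round)
        measurement = true_answers[i] + noise

        # Multiplicative Weights update toward the noisy measurement.
        error = measurement - answer(queries[i], synth)
        synth = [a * math.exp(w * error / (2.0 * n)) for a, w in zip(synth, queries[i])]
        total = sum(synth)
        synth = [a * n / total for a in synth]  # renormalise to total mass n

        average = [avg + a / rounds for avg, a in zip(average, synth)]

    return average
```

For example, `mwem([40, 10, 30, 20], [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1]], rounds=5, epsilon=1.0)` returns a non-negative synthetic histogram with the same total count as the input; how closely it matches the query answers depends on `epsilon` and the number of rounds.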
361 Citations
Permute-and-Flip: A new mechanism for differentially private selection
• Computer Science
• NeurIPS
• 2020
This work proposes a new mechanism for differentially private selection based on a careful analysis of the privacy constraints, which can offer improvements up to a factor of two and runs in linear time.
A minimax distortion view of differentially private query release
• Mathematics, Computer Science
• 2015 49th Asilomar Conference on Signals, Systems and Computers
• 2015
It is proved that the minimax distortion is O(1/n) as the database size n goes to infinity, with the squared-error distortion measure and fixed dimension of data entries, for the general class of statistical queries.
Differentially Private Data Publishing and Analysis: A Survey
• Computer Science
• IEEE Transactions on Knowledge and Data Engineering
• 2017
This survey compares the diverse release mechanisms of differentially private data publishing given a variety of input data in terms of query type, the maximum number of queries, efficiency, and accuracy.
Dual Query: Practical Private Query Release for High Dimensional Data
• Computer Science
• ICML
• 2014
The algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.
New Oracle-Efficient Algorithms for Private Synthetic Data Release
• Computer Science, Mathematics
• ICML
• 2020
Three new algorithms are presented for constructing differentially private synthetic data (a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries); the algorithms are computationally efficient when given access to an optimization oracle.
Hardness of Non-Interactive Differential Privacy from One-Way Functions
• Computer Science
• IACR Cryptol. ePrint Arch.
• 2017
The goal is to design a differentially private algorithm that takes a dataset, consisting of some small number of elements n from some large data universe X, and efficiently outputs a summary that allows a user to efficiently obtain an answer to any query in some large family Q.
Iterative Constructions and Private Data Release
• Computer Science
• TCC
• 2012
New algorithms (and new analyses of existing algorithms) in both the interactive and non-interactive settings are given, and a reduction based on the IDC framework shows that an efficient, private algorithm for computing sufficiently accurate rank-1 matrix approximations would lead to an improved efficient algorithm for releasing private synthetic data for graph cuts.
twinify : A software package for differentially private data release
Differentially private (DP) data sharing has facilitated releasing data containing sensitive information in a privacy-preserving manner. Many of these data sharing approaches rely on generative...
Subsampled Exponential Mechanism: Differential Privacy in Large Output Spaces
• Mathematics, Computer Science
• AISec@CCS
• 2015
This work presents the subsampled exponential mechanism, which scores only a sample of the outcomes, and shows that it still preserves differential privacy and fulfills a similar accuracy bound.
A Data- and Workload-Aware Algorithm for Range Queries Under Differential Privacy
• Computer Science
• ArXiv
• 2014
A new algorithm for answering a given set of range queries under $\epsilon$-differential privacy, which often achieves substantially lower error than competing methods, is described; it can achieve the benefits of data-dependence on both "easy" and "hard" databases.

#### References

SHOWING 1-10 OF 31 REFERENCES
Efficient Batch Query Answering Under Differential Privacy
• Computer Science
• ArXiv
• 2011
This paper develops efficient algorithms for answering multiple queries under differential privacy with low error by advancing a recent approach called the matrix mechanism, which generalizes standard differentially private mechanisms.
On the complexity of differentially private data release: efficient algorithms and hardness results
• Computer Science
• STOC '09
• 2009
This work considers private data analysis in the setting in which a trusted and trustworthy curator releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst.
• Computer Science
• Proc. VLDB Endow.
• 2012
An adaptive mechanism is presented for answering sets of counting queries under differential privacy; it approximates the optimal strategy for any workload of linear counting queries and achieves near-optimal error for many workloads.
A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis
• Computer Science
• 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
• 2010
A new differentially private multiplicative weights mechanism is presented for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen. It is shown that when the input database is drawn from a smooth distribution (a distribution that does not place too much weight on any single data item), accuracy remains as above, and the running time becomes poly-logarithmic in the data universe size.
Privacy-Preserving Datamining on Vertically Partitioned Databases
• Computer Science
• CRYPTO
• 2004
Under a rigorous definition of breach of privacy, Dinur and Nissim proved that unless the total number of queries is sub-linear in the size of the database, a substantial amount of noise is required to avoid a breach, rendering the database almost useless.
The Median Mechanism: Interactive and Efficient Privacy with Multiple Queries
• Computer Science
• ArXiv
• 2009
The median mechanism is the first privacy mechanism capable of identifying and exploiting correlations among queries in an interactive setting, and an efficient implementation is given, with running time polynomial in the number of queries, the database size, and the domain size.
Boosting the accuracy of differentially private histograms through consistency
• Computer Science
• Proc. VLDB Endow.
• 2010
It is shown that it is possible to significantly improve the accuracy of a general class of histogram queries while satisfying differential privacy, and that these techniques can be used for estimating the degree sequence of a graph very precisely, and for computing a histogram that can support arbitrary range queries accurately.
PCPs and the Hardness of Generating Private Synthetic Data
• Mathematics, Computer Science
• TCC
• 2011
It is shown that there is no polynomial-time, differentially private algorithm A that takes a database D and outputs a "synthetic database" $\hat{D}$ all of whose two-way marginals are approximately equal to those of D.
Mechanism Design via Differential Privacy
• Computer Science
• 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07)
• 2007
It is shown that the recent notion of differential privacy, in addition to its own intrinsic virtue, can ensure that participants have limited effect on the outcome of the mechanism, and as a consequence have limited incentive to lie.
Measuring the achievable error of query sets under differential privacy
• Computer Science
• ArXiv
• 2012
A novel lower bound on the minimum total error required to simultaneously release answers to a set of workload queries is presented; it shows that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form.