# PCPs and the Hardness of Generating Private Synthetic Data

@inproceedings{Ullman2011PCPsAT, title={PCPs and the Hardness of Generating Private Synthetic Data}, author={Jonathan Ullman and Salil P. Vadhan}, booktitle={TCC}, year={2011} }

Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm $A$ that takes a database $D \in (\{0,1\}^d)^n$ and outputs a "synthetic database" $\hat{D}$ all of whose two-way marginals are approximately equal to those of $D$. (A two-way marginal is the fraction of database rows $x \in \{0,1\}^d$ with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS '07), who gave an algorithm running in time $\mathrm{poly}(n, 2^d)$.
Our…
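
The definition of a two-way marginal in the abstract can be made concrete with a small sketch. The function names `two_way_marginals` and `max_marginal_error`, and the toy database below, are illustrative assumptions and do not come from the paper; the code only computes the statistics and provides no privacy guarantee.

```python
from itertools import combinations, product


def two_way_marginals(rows):
    """Map (i, j, a, b) to the fraction of rows x with x[i] == a and x[j] == b.

    `rows` is a non-empty list of equal-length 0/1 tuples (a database with
    n rows and d binary columns). Hypothetical helper for illustration.
    """
    n = len(rows)
    d = len(rows[0])
    marginals = {}
    for i, j in combinations(range(d), 2):          # every pair of columns
        for a, b in product((0, 1), repeat=2):      # every pair of values
            hits = sum(1 for x in rows if x[i] == a and x[j] == b)
            marginals[(i, j, a, b)] = hits / n
    return marginals


def max_marginal_error(rows, synthetic_rows):
    """Largest absolute gap between the two-way marginals of two databases."""
    true = two_way_marginals(rows)
    synth = two_way_marginals(synthetic_rows)
    return max(abs(true[q] - synth[q]) for q in true)


# Toy database with n = 4 rows and d = 3 columns (made-up data).
D = [(0, 1, 1), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
marginals = two_way_marginals(D)
```

A synthetic database is useful in the sense of the abstract when `max_marginal_error` between it and the original is small; the hardness result concerns producing such a database privately and in polynomial time, not computing the marginals themselves.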

## 84 Citations

### Faster Algorithms for Privately Releasing Marginals

- Computer Science
- ICALP
- 2012

To the authors' knowledge, this work gives the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of $k$-way marginal queries, which is $d^{\Theta(k)}$ (for $k \ll d$).
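
As a rough illustration of why the number of k-way marginal queries grows like d^Θ(k), the sketch below counts them under one common convention: a query picks k of the d binary columns and one 0/1 value for each. That convention and the function name `num_k_way_marginals` are assumptions made here, not details taken from the paper.

```python
from math import comb


def num_k_way_marginals(d, k):
    """Count k-way marginal queries over d binary columns.

    Assumed convention: choose k of the d columns, then one of the
    2**k value patterns on those columns.
    """
    return comb(d, k) * 2 ** k


# The count grows quickly with k even for a moderate number of columns.
for k in (1, 2, 3, 4):
    print(k, num_k_way_marginals(100, k))
```

For fixed k much smaller than d, `comb(d, k)` behaves like d^k / k!, which is where the d^Θ(k) growth comes from.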

### Faster private release of marginals on small databases

- Computer Science, Mathematics
- ITCS
- 2014

To the best of the authors' knowledge, this is the first algorithm capable of privately answering marginal queries with a non-trivial worst-case accuracy guarantee for databases containing $\mathrm{poly}(d, k)$ records in time $\exp(o(d))$.

### Fingerprinting codes and the price of approximate differential privacy

- Computer Science
- STOC
- 2014

The results rely on the existence of short fingerprinting codes (Boneh and Shaw, CRYPTO'95; Tardos, STOC'03), which are closely connected to the sample complexity of differentially private data release.

### Using Convex Relaxations for Efficiently and Privately Releasing Marginals

- Computer Science
- SoCG
- 2014

This work presents a polynomial-time algorithm that matches the best known information-theoretic bounds when $k = 2$ and achieves average error at most $\tilde{O}(\sqrt{n}\, d^{\lceil k/2 \rceil / 4})$, an improvement over previous work when $k$ is small and when error $o(n)$ is desirable.

### Faster Algorithms for Privately Releasing Marginals

- Computer Science
- 2012

To the authors' knowledge, this work gives an algorithm that runs in time $d^{O(\sqrt{k})}$ and releases a private summary capable of answering any $k$-way marginal query with at most ± …

### Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

- Computer Science
- ArXiv
- 2014

This work develops an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries that outputs a synthetic database, achieves an accuracy of $O(n^{-\frac{K}{2d+K}}/\epsilon)$, and runs in polynomial time.

### Answering $n^{2+o(1)}$ counting queries with differential privacy is hard

- Computer Science, Mathematics
- STOC '13
- 2013

It is proved that if one-way functions exist, then there is no algorithm that takes as input a database $D \in (\{0,1\}^d)^n$ and $k = \tilde{\Omega}(n^2)$ arbitrary efficiently computable counting queries, runs in time $\mathrm{poly}(d, n)$, and returns an approximate answer to each query while satisfying differential privacy.

### A learning theory approach to noninteractive database privacy

- Computer Science
- JACM
- 2013

It is shown that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy. A relaxation of the utility guarantee is also given.

### Strong Hardness of Privacy from Weak Traitor Tracing

- Computer Science, Mathematics
- TCC
- 2016

The hardness result for a polynomial size query set resp.

### New Oracle-Efficient Algorithms for Private Synthetic Data Release

- Computer Science
- ICML
- 2020

Three new algorithms for constructing differentially private synthetic data are presented. The synthetic data is a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries, and the algorithms are computationally efficient when given access to an optimization oracle.

## References

### Practical privacy: the SuLQ framework

- Computer Science
- PODS '05
- 2005

This work considers a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries, and extends the privacy analysis to real-valued functions $f$ and arbitrary row types, greatly improving the bounds on noise required for privacy.

### Calibrating Noise to Sensitivity in Private Data Analysis

- Computer Science
- TCC
- 2006

The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
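
The idea of calibrating noise to sensitivity can be sketched for a single counting query. This is a minimal illustration, assuming Laplace noise with scale sensitivity/ε (a standard choice for ε-differential privacy); the names `laplace_noise` and `noisy_count` are hypothetical, and the code is not hardened against floating-point or side-channel issues.

```python
import random


def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale).

    Uses the fact that the difference of two independent exponential
    variables with mean `scale` is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, predicate, epsilon, rng):
    """Answer 'how many rows satisfy predicate?' with calibrated noise.

    Changing one row changes the true count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon (assumed setup).
    """
    true_count = sum(1 for x in rows if predicate(x))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
rows = [(0, 1), (1, 1), (1, 0), (1, 1)]
answer = noisy_count(rows, lambda x: x[0] == 1 and x[1] == 1, epsilon=0.5, rng=rng)
```

Smaller values of `epsilon` give stronger privacy and larger noise; a query with higher sensitivity would need proportionally more noise for the same `epsilon`.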

### Interactive privacy via the median mechanism

- Computer Science
- STOC '10
- 2010

The median mechanism is the first privacy mechanism capable of identifying and exploiting correlations among queries in an interactive setting, and an efficient implementation is given, with running time polynomial in the number of queries, the database size, and the domain size.

### Cryptographic limitations on learning Boolean formulae and finite automata

- Computer Science, Mathematics
- JACM
- 1994

It is proved that a polynomial-time learning algorithm for Boolean formulae, deterministic finite automata, or constant-depth threshold circuits would have dramatic consequences for cryptography and number theory. The results are also applied to obtain strong intractability results for approximating a generalization of graph coloring.

### The complexity of properly learning simple concept classes

- Computer Science, Mathematics
- J. Comput. Syst. Sci.
- 2008

### Differential Privacy

- Computer Science
- Encyclopedia of Cryptography and Security
- 2006

A general impossibility result is given showing that a formalization of Dalenius' goal along the lines of semantic security cannot be achieved, which suggests a new measure, differential privacy, which, intuitively, captures the increased risk to one's privacy incurred by participating in a database.

### Boosting and Differential Privacy

- Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

This work obtains an $O(\epsilon^2)$ bound on the expected privacy loss from a single $\epsilon$-differentially private mechanism, and gets stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $\epsilon$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.

### Computationally Sound Proofs

- Computer Science, Mathematics
- SIAM J. Comput.
- 2000

If a special type of computationally sound proof exists, it is shown that Blum's notion of program checking can be meaningfully broadened so as to prove that $\mathcal{NP}$-complete languages are checkable.

### Universal arguments and their applications

- Computer Science, Mathematics
- Proceedings 17th IEEE Annual Conference on Computational Complexity
- 2002

It is shown that universal arguments can be constructed based on standard intractability assumptions that refer to polynomial-size circuits (rather than assumptions referring to subexponential-size circuits, as used in the construction of CS-proofs).

### Privacy-Preserving Datamining on Vertically Partitioned Databases

- Computer Science
- CRYPTO
- 2004

Under a rigorous definition of breach of privacy, Dinur and Nissim proved that unless the total number of queries is sub-linear in the size of the database, a substantial amount of noise is required to avoid a breach, rendering the database almost useless.