Answering n^{2+o(1)} counting queries with differential privacy is hard

@article{Ullman2013AnsweringNC,
  title={Answering $n^{2+o(1)}$ counting queries with differential privacy is hard},
  author={Jonathan Ullman},
  journal={ArXiv},
  year={2013},
  volume={abs/1207.6945}
}
A central problem in differentially private data analysis is how to design efficient algorithms capable of answering large numbers of counting queries on a sensitive database. Counting queries are of the form "What fraction of individual records in the database satisfy the property q?" We prove that if one-way functions exist, then there is no algorithm that takes as input a database db ∈ dbset, and k = Θ̃(n^2) arbitrary efficiently computable counting queries, runs in time poly(d, n), and… 
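As a concrete illustration (not taken from the paper itself), a single counting query can be answered with ε-differential privacy via the standard Laplace mechanism: changing one of the n records shifts the fraction by at most 1/n, so Laplace noise of scale 1/(εn) suffices. A minimal Python sketch, with illustrative function names:

```python
import math
import random

def counting_query(db, q):
    """Fraction of records in db satisfying the predicate q."""
    return sum(1 for row in db if q(row)) / len(db)

def laplace_noise(scale):
    """Sample from Laplace(0, scale) by inverting the CDF."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_counting_query(db, q, epsilon):
    """epsilon-DP answer to one counting query: the query has
    sensitivity 1/n, so Laplace noise of scale 1/(epsilon*n) suffices."""
    return counting_query(db, q) + laplace_noise(1.0 / (epsilon * len(db)))
```

The paper's hardness result concerns answering Ω̃(n^2) such queries simultaneously; composing this single-query mechanism naively over k queries inflates the noise scale to k/(εn), which is exactly why the many-query regime is the hard and interesting one.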
Fingerprinting codes and the price of approximate differential privacy
TLDR
The results rely on the existence of short fingerprinting codes (Boneh and Shaw, CRYPTO'95; Tardos, STOC'03), which are closely connected to the sample complexity of differentially private data release.
Faster private release of marginals on small databases
TLDR
To the best of our knowledge, this is the first algorithm capable of privately answering marginal queries with a non-trivial worst-case accuracy guarantee for databases containing poly(d, k) records in time exp(o(d)).
On the measurement complexity of differentially private query answering
TLDR
This work explores the hardness of differentially private query answering mechanisms beyond the stateless restriction and proves an unconditional subexponential lower bound on the measurement complexity of the class of sanitizers with ε-differential privacy.
Strong Hardness of Privacy from Weak Traitor Tracing
TLDR
The hardness result for a polynomial-size query set resp.
Efficient Algorithm for Privately Releasing Smooth Queries
TLDR
An ε-differentially private mechanism which for the class of K-smooth queries has accuracy O(n^{-K/(2d+K)}/ε) and is based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with small and efficiently computable coefficients.
Differentially Private Data Releasing for Smooth Queries
TLDR
This work develops two ε-differentially private mechanisms able to answer all smooth queries on continuous data, based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with uniformly bounded coefficients.
Interactive fingerprinting codes and the hardness of preventing false discovery
TLDR
It is shown that, under a standard hardness assumption, there is no computationally efficient algorithm that, given n samples from an unknown distribution, can give valid answers to O(n^2) adaptively chosen statistical queries.
Preventing False Discovery in Interactive Data Analysis Is Hard
We show that, under a standard hardness assumption, there is no computationally efficient algorithm that, given n samples from an unknown distribution, can give valid answers to n^{3+o(1)} adaptively chosen statistical queries.
Efficient Private Query Release via Polynomial Approximation
TLDR
It is shown that there exists a computationally efficient ε-differentially private mechanism that releases a query class parametrized by additively separable Hölder continuous functions, and that the accuracy can be significantly boosted.

References

SHOWING 1-10 OF 29 REFERENCES
Answering n^{2+o(1)} counting queries with differential privacy is hard
TLDR
It is proved that if one-way functions exist, then there is no algorithm that takes as input a database db ∈ dbset, and k = Ω̃(n^2) arbitrary efficiently computable counting queries, runs in time poly(d, n), and returns an approximate answer to each query, while satisfying differential privacy.
Privately releasing conjunctions and the statistical query barrier
TLDR
The number of statistical queries necessary and sufficient for this task is equal to the agnostic learning complexity of C in Kearns' statistical query (SQ) model, which isolates the complexity of agnostic learning in the SQ model as a new barrier in the design of differentially private algorithms.
A learning theory approach to noninteractive database privacy
TLDR
It is shown that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy and a relaxation of the utility guarantee is given.
Faster Algorithms for Privately Releasing Marginals
TLDR
To our knowledge, this is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of k-way marginal queries, which is d^{Θ(k)} (for k ≪ d).
A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis
TLDR
A new differentially private multiplicative weights mechanism for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen; it is shown that when the input database is drawn from a smooth distribution (a distribution that does not place too much weight on any single data item), accuracy remains as above and the running time becomes poly-logarithmic in the data universe size.
Interactive privacy via the median mechanism
TLDR
The median mechanism is the first privacy mechanism capable of identifying and exploiting correlations among queries in an interactive setting, and an efficient implementation is given, with running time polynomial in the number of queries, the database size, and the domain size.
Private data release via learning thresholds
TLDR
The task of analyzing a database containing sensitive information about individual participants is studied, and a computationally efficient reduction from differentially private data release for a class of counting queries, to learning thresholded sums of predicates from a related class is instantiated.
PCPs and the Hardness of Generating Private Synthetic Data
TLDR
It is shown that there is no polynomial-time, differentially private algorithm A that takes a database D and outputs a "synthetic database" D′ all of whose two-way marginals are approximately equal to those of D.
Practical privacy: the SuLQ framework
TLDR
This work considers a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries, and extends the privacy analysis to real-valued functions f and arbitrary row types, greatly improving the bounds on the noise required for privacy.
Iterative Constructions and Private Data Release
TLDR
New algorithms (and new analyses of existing algorithms) in both the interactive and non-interactive settings are given, and a reduction based on the IDC framework shows that an efficient, private algorithm for computing sufficiently accurate rank-1 matrix approximations would lead to an improved efficient algorithm for releasing private synthetic data for graph cuts.
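Several of the references above (the multiplicative weights mechanism, the median mechanism, the IDC framework) share one core idea: maintain a synthetic distribution over the data universe and update it only when a noisy true answer reveals a large error. The following is a heavily simplified, hypothetical sketch of that update loop, not any one paper's actual mechanism; in particular the privacy accounting is naive (the budget is split evenly per query, and the threshold and step size are illustrative parameters):

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) by inverting the CDF."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def mw_release(db, queries, universe, epsilon, threshold=0.05, eta=0.1):
    """Toy multiplicative-weights loop: keep a distribution over the data
    universe; when a noisy true answer disagrees with the synthetic answer,
    reweight the universe elements that satisfy the query."""
    n = len(db)
    weights = {x: 1.0 / len(universe) for x in universe}
    eps_per_query = epsilon / len(queries)   # naive composition
    answers = []
    for q in queries:
        true_ans = sum(1 for row in db if q(row)) / n
        noisy = true_ans + laplace_noise(1.0 / (eps_per_query * n))
        synth = sum(w for x, w in weights.items() if q(x))
        if abs(noisy - synth) > threshold:
            step = eta if noisy > synth else -eta
            for x in universe:
                if q(x):
                    weights[x] *= math.exp(step)
            total = sum(weights.values())
            for x in weights:
                weights[x] /= total          # renormalize to a distribution
            answers.append(noisy)
        else:
            answers.append(synth)            # answer from the synthetic dist
    return answers
```

The point of the real mechanisms is that the number of multiplicative updates is bounded independently of the number of queries, so the privacy budget is spent only on the rounds that actually update the distribution; the sketch above omits that accounting.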