Privately releasing conjunctions and the statistical query barrier

@inproceedings{Gupta2011PrivatelyRC,
  title={Privately releasing conjunctions and the statistical query barrier},
  author={Anupam Gupta and Moritz Hardt and Aaron Roth and Jonathan Ullman},
  booktitle={STOC '11},
  year={2011}
}
Suppose we would like to know the answers to all statistical queries in a set C on a data set up to small error, but we can only access the data itself using statistical queries. A trivial solution is to exhaustively ask every query in C. Can we do any better? We show that the number of statistical queries necessary and sufficient for this task is, up to polynomial factors, equal to the agnostic learning complexity of C in Kearns' statistical query (SQ) model. This gives a complete answer to the…
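As a point of reference, the SQ access model described in the abstract can be sketched as follows. This is an illustrative simulation only: the `sq_oracle` name, the toy data, and the bounded-uniform noise model are our own choices, not taken from the paper.

```python
import random

def sq_oracle(dataset, predicate, tau):
    """Simulate one statistical query: return the mean of a {0,1}-valued
    predicate over the data set, perturbed by at most the tolerance tau."""
    exact = sum(predicate(x) for x in dataset) / len(dataset)
    return exact + random.uniform(-tau, tau)

# Toy data set of 3-bit records; ask a 2-way conjunction
# ("what fraction of records have x0 = 1 AND x1 = 1?"),
# the query class whose private release the paper studies.
data = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
answer = sq_oracle(data, lambda x: x[0] == 1 and x[1] == 1, tau=0.05)
# `answer` is within tau = 0.05 of the true fraction, 0.5.
```

The question the abstract poses is then: how many such oracle calls are needed to approximate every query in C, compared with asking all of them one by one?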
A learning theory approach to noninteractive database privacy
It is shown that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy and a relaxation of the utility guarantee is given.
Faster private release of marginals on small databases
To the best of our knowledge, this is the first algorithm capable of privately answering marginal queries with a non-trivial worst-case accuracy guarantee for databases containing poly(d, k) records in time exp(o(d)).
Fast Private Data Release Algorithms for Sparse Queries
This paper considers the large class of sparse queries, which take non-zero values on only polynomially many universe elements, and gives efficient query release algorithms for this class, in both the interactive and the non-interactive setting.
Answering n^{2+o(1)} counting queries with differential privacy is hard
It is proved that if one-way functions exist, then there is no algorithm that takes as input a database of n records and k = Ω̃(n²) arbitrary efficiently computable counting queries, runs in time poly(d, n), and returns an approximate answer to each query while satisfying differential privacy.
Differentially Private Data Releasing for Smooth Queries
This work develops two ε-differentially private mechanisms able to answer all smooth queries on continuous data, based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with uniformly bounded coefficients.
Faster Algorithms for Privately Releasing Marginals
To our knowledge, this work is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of k-way marginal queries, which is d^Θ(k) (for k ≪ d).
Private data release via learning thresholds
The task of analyzing a database containing sensitive information about individual participants is studied, giving a computationally efficient reduction from differentially private data release for a class of counting queries to learning thresholded sums of predicates from a related class.
Differentially Private Query Release Through Adaptive Projection
A new algorithm for releasing answers to very large numbers of statistical queries like k-way marginals, subject to differential privacy, makes adaptive use of a continuous relaxation of the Projection Mechanism, and outperforms existing algorithms on large query classes.
Computational Bounds on Statistical Query Learning
This work gives the first strictly computational upper and lower bounds on the complexity of several types of learning in the statistical query (SQ) learning model, and proves that the distribution-specific lower bound is essentially tight.
Exploiting Metric Structure for Efficient Private Query Release
This work gives simple, computationally efficient algorithms for answering distance queries defined over an arbitrary metric, one of the first subclasses of linear queries for which efficient algorithms are known for the private query release problem, circumventing known hardness results for generic linear queries.
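Several of the entries above ("Faster private release of marginals on small databases", "Faster Algorithms for Privately Releasing Marginals") concern k-way marginal queries. As a non-private point of reference, the following illustrative Python (function names and toy data are our own) shows what one marginal query computes and why there are d^Θ(k) of them:

```python
from itertools import combinations, product

def marginal_query(dataset, positions, values):
    """Fraction of d-bit records that agree with `values`
    on the k coordinates in `positions` (a k-way marginal)."""
    hits = sum(all(x[i] == v for i, v in zip(positions, values))
               for x in dataset)
    return hits / len(dataset)

def all_k_way_marginals(d, k):
    """Enumerate every k-way marginal over {0,1}^d: C(d, k) * 2^k
    queries in total, i.e. d^Theta(k) when k << d."""
    for positions in combinations(range(d), k):
        for values in product((0, 1), repeat=k):
            yield positions, values

data = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
frac = marginal_query(data, (0, 1), (1, 1))  # records with x0 = 1 and x1 = 1
num_queries = sum(1 for _ in all_k_way_marginals(d=4, k=2))  # C(4,2) * 2^2 = 24
```

The exponential count is what makes "time substantially smaller than the number of queries" a meaningful benchmark in the entries above.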

References

Showing 1–10 of 42 references
A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis
A new differentially private multiplicative weights mechanism for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen, and it is shown that when the input database is drawn from a smooth distribution — a distribution that does not place too much weight on any single data item — accuracy remains as above, and the running time becomes poly-logarithmic in the data universe size.
Practical privacy: the SuLQ framework
This work considers a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries, and modifies the privacy analysis to real-valued functions f and arbitrary row types, greatly improving the bounds on the noise required for privacy.
Interactive privacy via the median mechanism
The median mechanism is the first privacy mechanism capable of identifying and exploiting correlations among queries in an interactive setting, and an efficient implementation is given, with running time polynomial in the number of queries, the database size, and the domain size.
Weakly learning DNF and characterizing statistical query learning using Fourier analysis
It is proved that an algorithm due to Kushilevitz and Mansour can be used to weakly learn DNF using membership queries in polynomial time with respect to the uniform distribution on the inputs, and it is obtained that DNF expressions and decision trees are not even weakly learnable, without any unproven assumptions.
Lower Bounds in Differential Privacy
This paper combines the techniques of Hardt and Talwar [11] and McGregor et al.
Efficient noise-tolerant learning from statistical queries
This paper formalizes a new but related model of learning from statistical queries and demonstrates the generality of the statistical query model, showing that practically every class learnable in Valiant's model and its variants can also be learned in the new model (and thus can be learned in the presence of noise).
On the geometry of differential privacy
The lower bound is strong enough to separate the concept of differential privacy from the notion of approximate differential privacy, where an upper bound of O(√d/ε) can be achieved.
On the complexity of differentially private data release: efficient algorithms and hardness results
Private data analysis in the setting in which a trusted and trustworthy curator releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst is considered.
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
The price of privately releasing contingency tables and the spectra of random matrices with correlated rows
Lower bounds on how much distortion is necessary in marginal tables to ensure the privacy of sensitive data are derived; stronger results are obtained for mechanisms that add instance-independent noise, and weaker results when k is super-constant.
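The "Calibrating Noise to Sensitivity in Private Data Analysis" entry above is the origin of the Laplace mechanism. A minimal sketch under standard assumptions (the function names and the toy counting query are ours; the sensitivity of a normalized count over n records is 1/n):

```python
import random

def laplace_noise(scale):
    """Draw one Laplace(0, scale) sample as the difference
    of two i.i.d. exponential variates."""
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def laplace_mechanism(dataset, query, sensitivity, epsilon):
    """Release query(dataset) plus Laplace(sensitivity / epsilon) noise,
    the calibration that yields epsilon-differential privacy."""
    return query(dataset) + laplace_noise(sensitivity / epsilon)

data = [0, 1, 1, 0, 1]
frac_ones = lambda db: sum(db) / len(db)  # normalized counting query
# Changing one record moves frac_ones by at most 1/n, so sensitivity = 1/n.
noisy = laplace_mechanism(data, frac_ones, sensitivity=1 / len(data), epsilon=1.0)
```

Smaller epsilon (stronger privacy) or larger sensitivity means a larger noise scale, which is the trade-off the lower-bound papers in this list quantify.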