Differential Privacy for Statistics: What we Know and What we Want to Learn

@article{Dwork2010DifferentialPF,
  title={Differential Privacy for Statistics: What we Know and What we Want to Learn},
  author={Cynthia Dwork and Adam D. Smith},
  journal={J. Priv. Confidentiality},
  year={2010},
  volume={1}
}
We motivate and review the definition of differential privacy. Our principal motivating scenario is a statistical database, in which a trusted and trustworthy curator (in our minds, the Census Bureau) gathers sensitive information from a large number of respondents (the sample), with the goal of learning and releasing to the public statistical facts about the underlying population. The difficulty, of course, is to release statistical information without compromising the privacy of the individual respondents.
Citations

A framework for privacy preserving statistical analysis on distributed databases
  • B. Lin, Ye Wang, S. Rane
  • Computer Science
  • 2012 IEEE International Workshop on Information Forensics and Security (WIFS)
  • 2012
This work is an attempt toward providing a theoretical formulation of privacy and utility for problems of this type; a constructive scheme based on randomized response is presented as an example mechanism that satisfies the formulated privacy requirements.
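Randomized response, the mechanism invoked above, predates differential privacy but satisfies it. A minimal Python sketch of the classic two-coin variant (the function names and the simulation are illustrative, not from the cited paper; the two-coin scheme is known to satisfy ε-differential privacy with ε = ln 3):

```python
import random

def randomized_response(truth: bool) -> bool:
    # First fair coin: heads -> answer truthfully.
    if random.random() < 0.5:
        return truth
    # Tails -> answer with a second fair coin, ignoring the truth.
    return random.random() < 0.5

def estimate_fraction(responses) -> float:
    # A true "yes" is reported with probability 3/4 and a true "no"
    # with probability 1/4, so 2 * (observed yes-rate) - 1/2 is an
    # unbiased estimate of the population fraction.
    yes_rate = sum(responses) / len(responses)
    return 2 * yes_rate - 0.5

# Simulate 100,000 respondents, ~30% of whom hold the sensitive attribute.
random.seed(0)
truths = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(t) for t in truths]
print(estimate_fraction(reports))  # close to 0.3
```

Each individual's report is plausibly deniable, yet the aggregate fraction is still recoverable — the privacy/utility trade-off these papers formalize.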
On the Benefits of Sampling in Privacy Preserving Statistical Analysis on Distributed Databases
This work describes a data release mechanism that employs Post Randomization (PRAM), encryption, and random sampling to maintain privacy, while allowing the authorized party to conduct an accurate statistical analysis of the data.
Differential Privacy for Social Science Inference
A secure curator interface is detailed, through which researchers can access privatized statistical results from their queries without gaining any access to the underlying raw data; differential privacy and the construction of differentially private summary statistics are introduced.
No free lunch in data privacy
This paper argues that privacy of an individual is preserved when it is possible to limit the inference of an attacker about the participation of the individual in the data generating process, which is different from limiting the inference about the presence of a tuple.
Investigating Statistical Privacy Frameworks from the Perspective of Hypothesis Testing
Findings show that an adversary's auxiliary information — in the form of a prior distribution of the database and correlation across records and time — indeed influences the proper choice of ε, and can provide useful insights on the relationships among a broad range of privacy frameworks.
Differential Privacy and Machine Learning: a Survey and Review
This paper explores the interplay between machine learning and differential privacy, namely privacy-preserving machine learning algorithms and learning-based data release mechanisms, and describes some theoretical results that address what can be learned differentially privately and upper bounds of loss functions for differentially private algorithms.
A Theory of Privacy and Utility in Databases
This paper presents the first information-theoretic approach that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa.
Asymptotically Optimal and Private Statistical Estimation
This talk discusses two differentially private estimators that, given i.i.d. samples from a probability distribution, converge to the correct answer at the same rate as the optimal nonprivate estimator.
Issues in Data Privacy
Managing a data set with sensitive but useful information, such as medical records, requires reconciling two objectives: providing utility to others and respecting the privacy of individuals who…
Privacy and Statistical Risk: Formalisms and Minimax Bounds
This work explores and compares a variety of definitions for privacy and disclosure limitation in statistical estimation and data analysis, and provides minimax risk bounds for several estimation problems, including mean estimation, estimation of the support of a distribution, and nonparametric density estimation.

References

Showing 1-10 of 77 references
Toward Privacy in Public Databases
An important contribution of this work is a definition of privacy (and privacy compromise) for statistical databases, together with a method for describing and comparing the privacy offered by specific sanitization techniques.
On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy
A general impossibility result is given showing that a natural formalization of Dalenius' goal cannot be achieved if the database is useful, and a variant of the result threatens the privacy even of someone not in the database.
Differential Privacy
A general impossibility result is given showing that a formalization of Dalenius' goal along the lines of semantic security cannot be achieved, which suggests a new measure, differential privacy, which, intuitively, captures the increased risk to one's privacy incurred by participating in a database.
New Efficient Attacks on Statistical Disclosure Control Mechanisms
The Dinur-Nissim style results are strong because they demonstrate insecurity of all low-distortion privacy mechanisms; a more acute attack is presented, requiring only a fixed number of queries for each bit revealed.
Smooth sensitivity and sampling in private data analysis
This is the first formal analysis of the effect of instance-based noise in the context of data privacy; the authors show how to add such noise efficiently for several different functions, including the median and the cost of the minimum spanning tree.
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
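The calibration idea summarized above can be sketched in a few lines. This is an illustrative Python sketch, not code from the paper; `private_count` and `laplace_noise` are hypothetical names. For a counting query the global L1 sensitivity is 1 (adding or removing one record changes the count by at most 1), so Laplace noise with scale 1/ε suffices for ε-differential privacy:

```python
import random

def laplace_noise(scale: float) -> float:
    # A Laplace(0, scale) draw: the difference of two independent
    # rate-1 exponentials, scaled, is Laplace-distributed.
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def private_count(records, predicate, epsilon: float) -> float:
    # Counting queries have L1 sensitivity 1, so noise scale is 1/epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: a noisy count of respondents aged 65 or older.
random.seed(1)
ages = [random.randint(18, 90) for _ in range(10_000)]
noisy = private_count(ages, lambda a: a >= 65, epsilon=0.5)
```

For ε = 0.5 the noise has scale 2 and standard deviation about 2.8 — negligible relative to a count in the thousands, which is the sense in which error is independent of sample size.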
A Statistical Framework for Differential Privacy
One goal of statistical privacy research is to construct a data release mechanism that protects individual privacy while preserving information content. An example is a random mechanism that takes an…
Tabular data protection: Differentially private marginals release with mutual consistency and error independent of sample size
We report on a result of Barak et al. on a privacy-preserving technology for release of mutually consistent multi-way marginals [1]. The result ensures differential privacy, a mathematically rigorous…
Practical privacy: the SuLQ framework
This work considers a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries, and extends the privacy analysis to real-valued functions f and arbitrary row types, greatly improving the bounds on noise required for privacy.
What Can We Learn Privately?
This work investigates learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in contexts where aggregate information is released about a database containing sensitive information about individuals.