Corpus ID: 7409485

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

@inproceedings{Wang2015PrivacyFF,
  title={Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo},
  author={Yu-Xiang Wang and Stephen E. Fienberg and Alex Smola},
  booktitle={ICML},
  year={2015}
}
We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protect individual-level privacy while permitting database-level utility. Specifically, we show that under standard assumptions, getting one sample from a posterior distribution is differentially private "for free"; and this sample as a statistical estimator is often consistent, near… 
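The first result is easy to make concrete. Below is a minimal sketch of the "one posterior sample" (OPS) mechanism the abstract describes, under the paper's bounded log-likelihood assumption; the Bernoulli model, the discretized parameter grid, and the truncation of theta to [0.1, 0.9] are illustrative choices of this sketch, not the paper's setup.

```python
# A minimal sketch of the "one posterior sample" (OPS) idea: when every
# per-record log-likelihood is bounded, |log p(x | theta)| <= B, the paper
# shows that releasing a single exact posterior sample is 4B-differentially
# private. The Bernoulli model, grid, and flat prior below are illustrative
# assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(0)

def one_posterior_sample(data, grid):
    """Draw one exact sample from the posterior over a finite grid."""
    # log-likelihood of the whole dataset at each grid point (flat prior)
    log_lik = np.array([
        np.sum(data * np.log(t) + (1 - data) * np.log(1 - t)) for t in grid
    ])
    log_post = log_lik - log_lik.max()   # normalize in log space for stability
    post = np.exp(log_post)
    post /= post.sum()
    return rng.choice(grid, p=post)

# Truncating theta to [0.1, 0.9] bounds each per-record log-likelihood
# (B = -log 0.1), which is exactly the condition the guarantee needs.
grid = np.linspace(0.1, 0.9, 81)
data = rng.binomial(1, 0.3, size=1000)
print(one_posterior_sample(data, grid))  # one 4B-DP draw, also a consistent estimator
```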

Citations

Towards More Practical Stochastic Gradient MCMC in Differential Privacy
TLDR
Stochastic gradient Markov chain Monte Carlo (SG-MCMC) – a class of scalable Bayesian posterior sampling algorithms – satisfies strong differential privacy when carefully chosen stepsizes are employed.
Bayesian Differential Privacy through Posterior Sampling
TLDR
The answer is affirmative: under certain conditions on the prior, sampling from the posterior distribution can be used to achieve a desired level of privacy and utility, and bounds on the sensitivity of the posterior to the data are proved, giving a measure of robustness.
Differential Privacy for Bayesian Inference through Posterior Sampling
TLDR
This work defines differential privacy over arbitrary data set metrics, outcome spaces and distribution families, and proves bounds on the sensitivity of the posterior to the data, which delivers a measure of robustness.
On the Differential Privacy of Bayesian Inference
TLDR
This work studies how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy, with a novel focus on the influence of graph structure on privacy.
Differential Privacy in a Bayesian setting through posterior sampling
TLDR
Bounds on the robustness of the posterior are proved, a posterior sampling mechanism is introduced and shown to be differentially private, and finite-sample bounds for distinguishability-based privacy under a strong adversarial model are provided.
Statistic Selection and MCMC for Differentially Private Bayesian Estimation
TLDR
It is demonstrated that the relative performance of a statistic, in terms of the mean squared error of the Bayesian estimator based on the corresponding privatized statistic, is adequately predicted by the Fisher information of the privatized statistic.
Privacy-Preserving Parametric Inference: A Case for Robust Statistics
TLDR
It is demonstrated that differential privacy is a weaker stability requirement than infinitesimal robustness, and it is shown that robust M-estimators can be easily randomized to guarantee both differential privacy and robustness toward the presence of contaminated data.
On Connecting Stochastic Gradient MCMC and Differential Privacy
TLDR
It is shown that stochastic gradient Markov chain Monte Carlo (SG-MCMC) -- a class of scalable Bayesian posterior sampling algorithms proposed recently -- satisfies strong differential privacy with carefully chosen step sizes.
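For intuition on why this holds, here is a minimal sketch of the stochastic gradient Langevin dynamics (SGLD) update these works analyze: the Gaussian noise SGLD already injects for sampling doubles as privacy noise once per-example gradients are bounded and the stepsize is small. The gradient clipping, toy Gaussian model, and constants below are assumptions of this sketch, not the exact conditions derived in the papers.

```python
# One SGLD update: a rescaled minibatch gradient of the log-posterior plus
# N(0, eta) noise. Clipping (an assumption of this sketch) bounds each
# record's influence, so the injected noise can serve as the privacy noise.
import numpy as np

rng = np.random.default_rng(1)

def sgld_step(theta, batch, grad_log_lik, grad_log_prior, N, eta, clip=1.0):
    g = np.zeros_like(theta)
    for x in batch:
        gi = grad_log_lik(theta, x)
        gi *= min(1.0, clip / (np.linalg.norm(gi) + 1e-12))  # bound sensitivity
        g += gi
    g *= N / len(batch)                  # rescale minibatch to the full dataset
    g += grad_log_prior(theta)
    noise = rng.normal(0.0, np.sqrt(eta), size=theta.shape)
    return theta + 0.5 * eta * g + noise

# toy usage: posterior over the mean of a unit-variance Gaussian
data = rng.normal(2.0, 1.0, size=500)
theta = np.zeros(1)
for _ in range(2000):
    batch = rng.choice(data, size=32)
    theta = sgld_step(theta, batch,
                      grad_log_lik=lambda th, x: x - th,      # d/dth log N(x | th, 1)
                      grad_log_prior=lambda th: -th / 100.0,  # N(0, 100) prior
                      N=len(data), eta=1e-4)
```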
A New Bound for Privacy Loss from Bayesian Posterior Sampling
TLDR
The privacy loss quantified by the new bound is applied to release differentially private synthetic data from Bayesian models in several experiments and the improved utility of the synthetic data is shown compared to those generated from explicitly designed randomization mechanisms that privatize posterior distributions.
Exact MCMC with differentially private moves
TLDR
The penalty algorithm of Ceperley and Dewing (J Chem Phys 110(20):9812–9820, 1999), a Markov chain Monte Carlo algorithm for Bayesian inference, is viewed in the context of data privacy, and its use for privacy-preserving Bayesian inference is advocated.
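Concretely, the penalty trick works as follows: if the log acceptance ratio is observed with Gaussian noise of known variance sigma^2 (here, noise added for privacy), subtracting sigma^2 / 2 before the accept test leaves the target distribution exactly invariant. The random-walk proposal and the target below are illustrative assumptions of this sketch.

```python
# A minimal sketch of a penalty-algorithm Metropolis-Hastings step: the
# noisy log acceptance ratio is debiased by sigma^2 / 2, so the chain
# remains exactly invariant for the target despite the privacy noise.
import numpy as np

rng = np.random.default_rng(2)

def penalty_mh_step(theta, log_target, sigma, step=0.5):
    prop = theta + step * rng.normal()
    # log acceptance ratio observed through Gaussian privacy noise
    noisy_delta = log_target(prop) - log_target(theta) + sigma * rng.normal()
    # accept with probability min(1, exp(noisy_delta - sigma^2 / 2))
    if np.log(rng.uniform()) < noisy_delta - 0.5 * sigma**2:
        return prop
    return theta
```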
…

References

SHOWING 1-10 OF 70 REFERENCES
Robust and Private Bayesian Inference
TLDR
Bounds on the robustness of the posterior are proved, a posterior sampling mechanism is introduced and shown to be differentially private, and finite-sample bounds for distinguishability-based privacy under a strong adversarial model are provided.
Probabilistic Inference and Differential Privacy
TLDR
It is found that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured.
Efficient, Differentially Private Point Estimators
TLDR
It is shown that for a large class of parametric probability models, one can construct a differentially private estimator whose distribution converges to that of the maximum likelihood estimator, which is efficient and asymptotically unbiased.
Bayesian inference under differential privacy
TLDR
Theoretical and experimental analyses demonstrate the efficiency and effectiveness of both the inference mechanism and the online query-answering system.
On the 'Semantics' of Differential Privacy: A Bayesian Formulation
TLDR
This paper provides a precise formulation of differential privacy guarantees in terms of the inferences drawn by a Bayesian adversary, and shows that this formulation is satisfied by both "vanilla" differential privacy as well as a relaxation known as (epsilon,delta)-differential privacy.
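For reference, the two guarantees this entry contrasts are the standard definitions: for all neighboring datasets D, D' (differing in one record) and all measurable outcome sets S,

```latex
% "vanilla" epsilon-differential privacy
\Pr[M(D) \in S] \;\le\; e^{\epsilon}\,\Pr[M(D') \in S]
% the (epsilon, delta) relaxation allows an additive slack
\Pr[M(D) \in S] \;\le\; e^{\epsilon}\,\Pr[M(D') \in S] + \delta
```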
Private Convex Empirical Risk Minimization and High-dimensional Regression
TLDR
This work significantly extends the analysis of the “objective perturbation” algorithm of Chaudhuri et al. (2011) for convex ERM problems, and gives the best known algorithms for differentially private linear regression.
Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring
TLDR
By leveraging the Bayesian Central Limit Theorem, the SGLD algorithm is extended so that at high mixing rates it samples from a normal approximation of the posterior, while for slow mixing rates it mimics the behavior of SGLD with a preconditioner matrix.
Stochastic gradient descent with differentially private updates
TLDR
This paper derives differentially private versions of stochastic gradient descent, and test them empirically to show that standard SGD experiences high variability due to differential privacy, but a moderate increase in the batch size can improve performance significantly.
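As a sketch of what such a private update looks like (the Gaussian mechanism, clipping threshold, and logistic loss below are assumptions of this illustration, not the paper's exact construction), note how the noise scale shrinks as the batch grows, matching the paper's observation that moderately larger batches help:

```python
# One differentially private SGD step on the logistic loss: per-example
# gradients are clipped to bound sensitivity, averaged, and perturbed
# with noise whose scale shrinks as the batch size grows.
import numpy as np

rng = np.random.default_rng(3)

def private_sgd_step(w, X_batch, y_batch, lr, clip=1.0, noise_mult=1.0):
    grads = []
    for x, y in zip(X_batch, y_batch):       # labels assumed in {-1, +1}
        margin = y * (x @ w)
        g = -y * x / (1.0 + np.exp(margin))  # gradient of log(1 + e^{-margin})
        g *= min(1.0, clip / (np.linalg.norm(g) + 1e-12))
        grads.append(g)
    sigma = noise_mult * clip / len(X_batch)  # larger batches -> less noise
    return w - lr * (np.mean(grads, axis=0) + rng.normal(0.0, sigma, size=w.shape))
```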
Differentially Private Empirical Risk Minimization
TLDR
This work proposes a new method, objective perturbation, for privacy-preserving machine learning algorithm design, and shows that both theoretically and empirically, this method is superior to the previous state-of-the-art, output perturbations, in managing the inherent tradeoff between privacy and learning performance.
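The mechanism itself is compact; a minimal sketch for L2-regularized logistic regression follows. The noise scale and the use of a generic optimizer are assumptions of this sketch, and the paper's additional smoothness correction to the regularizer is omitted.

```python
# Objective perturbation in the spirit of Chaudhuri et al. (2011): a random
# linear term b^T w / n is added to the training objective before solving.
# Rows of X are assumed scaled to norm <= 1 and labels to {-1, +1}.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

def objective_perturbation(X, y, lam, eps):
    n, d = X.shape
    # b has density proportional to exp(-(eps/2) * ||b||): gamma-distributed
    # norm times a uniformly random direction
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    b = direction * rng.gamma(shape=d, scale=2.0 / eps)

    def obj(w):
        margins = y * (X @ w)
        return (np.logaddexp(0.0, -margins).mean()   # logistic loss
                + 0.5 * lam * w @ w                  # L2 regularizer
                + (b @ w) / n)                       # privacy perturbation

    return minimize(obj, np.zeros(d), method="L-BFGS-B").x
```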
…