Corpus ID: 5765263

Private Posterior distributions from Variational approximations

Vishesh Karwa, Daniel Kifer, and Aleksandra B. Slavkovic
Privacy-preserving mechanisms such as differential privacy inject additional randomness, in the form of noise added to the data, beyond the sampling mechanism. Ignoring this additional noise can lead to inaccurate and invalid inferences. In this paper, we incorporate the privacy mechanism explicitly into the likelihood function by treating the original data as missing, with the end goal of estimating posterior distributions over model parameters. This leads to a principled way of performing valid…
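The abstract's core idea, treating the original data as missing and marginalizing over it inside the likelihood, can be sketched numerically. The setup below is a hypothetical illustration, not the paper's exact model: a binomial count is released with Laplace noise, and the posterior over the success probability sums the likelihood over every possible true count.

```python
import math

def laplace_pdf(x, scale):
    """Density of a centered Laplace distribution with the given scale."""
    return math.exp(-abs(x) / scale) / (2 * scale)

def binom_pmf(s, n, theta):
    """Binomial probability of s successes in n trials."""
    return math.comb(n, s) * theta**s * (1 - theta)**(n - s)

def private_posterior(z, n, eps, grid):
    """Posterior over theta given only the noisy count z.

    p(theta | z) is proportional to p(theta) * sum_s p(z | s) p(s | theta):
    the unobserved true count s is treated as missing data and summed out,
    so the privacy mechanism enters the likelihood explicitly.
    """
    scale = 1.0 / eps  # Laplace scale for a sensitivity-1 count query
    weights = []
    for theta in grid:
        lik = sum(laplace_pdf(z - s, scale) * binom_pmf(s, n, theta)
                  for s in range(n + 1))
        weights.append(lik)  # flat prior on theta
    total = sum(weights)
    return [w / total for w in weights]

grid = [i / 100 for i in range(1, 100)]
post = private_posterior(z=32.4, n=100, eps=1.0, grid=grid)
```

Ignoring the noise here would mean treating the noisy value 32.4 as if it were an exact count; the marginalized likelihood instead spreads posterior mass to reflect both sampling and privacy noise.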


Data Augmentation MCMC for Bayesian Inference from Privatized Data
This work proposes an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms and illustrates the efficacy and applicability of the methods on a naïve-Bayes log-linear model as well as on a linear regression model.
Differentially private model selection with penalized and constrained likelihood
This work shows that model selection procedures based on penalized least squares or likelihood can be made differentially private by a combination of regularization and randomization, and proposes two algorithms to do so.
Differentially private posterior summaries for linear regression coefficients
This article proposes some differentially private algorithms for reporting posterior probabilities and posterior quantiles of linear regression coefficients that use the general strategy of subsample and aggregate, a technique that requires randomly partitioning the data into disjoint subsets, estimating the regression within each subset, and combining results in ways that satisfy differential privacy.
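The subsample-and-aggregate strategy described above can be sketched for a simple mean estimate. This is a hedged illustration with names of my own choosing; the article's actual algorithms target posterior probabilities and quantiles of regression coefficients, not a raw mean.

```python
import math
import random

def subsample_and_aggregate_mean(data, k, eps, lo, hi, rng):
    """Differentially private mean via subsample and aggregate.

    Randomly partition `data` into k disjoint subsets, estimate the mean
    within each subset, clamp each sub-estimate to [lo, hi], then release
    the average of the k sub-estimates with Laplace noise. A single record
    affects exactly one subset, so the sensitivity of the aggregate
    average is (hi - lo) / k.
    """
    data = data[:]
    rng.shuffle(data)
    size = len(data) // k
    chunks = [data[i * size:(i + 1) * size] for i in range(k)]
    ests = [min(max(sum(c) / len(c), lo), hi) for c in chunks]
    agg = sum(ests) / k
    scale = (hi - lo) / (k * eps)
    u = rng.random() - 0.5
    noise = -scale * math.copysign(math.log(1 - 2 * abs(u)), u)
    return agg + noise

rng = random.Random(0)
data = [i / 100 for i in range(100)]
dp_mean = subsample_and_aggregate_mean(data, k=10, eps=5.0, lo=0.0, hi=1.0, rng=rng)
```

The appeal of the strategy is that the within-subset estimator can be arbitrary (here a mean, in the article a regression fit); only the bounded aggregation step needs a sensitivity analysis.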
Elliptical Perturbations for Differential Privacy
It is shown that the privacy parameter $\epsilon$ can be computed in terms of a one-dimensional maximization problem, and that when the dimension of the space is infinite, no elliptical distribution can be used to give $\epsilon$-DP; only $(\epsilon,\delta)$-DP is possible.
Finite Sample Differentially Private Confidence Intervals (Extended Abstract)
This work considers both the known and unknown variance cases and constructs differentially private algorithms to estimate confidence intervals that guarantee finite sample coverage, as opposed to asymptotic coverage.
Finite Sample Differentially Private Confidence Intervals
These algorithms guarantee finite sample coverage, as opposed to asymptotic coverage; the work also proves lower bounds on the expected size of any differentially private confidence set, showing that the parameters are optimal up to polylogarithmic factors.
Differentially Private Significance Tests for Regression Coefficients
Algorithms for assessing whether regression coefficients of interest are statistically significant or not are presented and conditions under which the algorithms should give accurate answers about statistical significance are described.
A Differentially Private Bayesian Approach to Replication Analysis
This paper presents two methods for replication analysis and illustrates the properties of these methods by a combination of theoretical analysis and simulation.
Differentially Private Verification of Predictions from Synthetic Data
By Haoyang Yu, Program in Statistical and Economic Modeling, Duke University.
Confidentiality and Differential Privacy in the Dissemination of Frequency Tables
This paper studies confidentiality protection for perturbed frequency tables, including the trade-off with analytical utility, focusing on a version of the ABS TableBuilder as a concrete example of a data release mechanism, and examining its properties.


Probabilistic Inference and Differential Privacy
It is found that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured.
Personal privacy vs population privacy: learning to attack anonymization
It is demonstrated that even under Differential Privacy, such classifiers can be used to infer "private" attributes accurately in realistic data and it is observed that the accuracy of inference of private attributes for differentially private data and $l$-diverse data can be quite similar.
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
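The calibration described above is the Laplace mechanism: for a function f with sensitivity Δf, adding Laplace noise with scale Δf/ε yields ε-differential privacy. A minimal sketch, sampling the noise via the inverse CDF (function and parameter names are illustrative):

```python
import math
import random

def laplace_mechanism(value, sensitivity, eps, rng):
    """Release value + Laplace(0, sensitivity / eps) noise.

    The noise scale grows with the sensitivity of f — the amount any
    single record can change f's output — and shrinks as eps grows.
    """
    scale = sensitivity / eps
    # Inverse-CDF sampling of a centered Laplace with the given scale.
    u = rng.random() - 0.5
    return value - scale * math.copysign(math.log(1 - 2 * abs(u)), u)

rng = random.Random(0)
# Privatize a count query (sensitivity 1) at eps = 1.
noisy_count = laplace_mechanism(42, sensitivity=1, eps=1.0, rng=rng)
```

A count query has sensitivity 1 because adding or removing one record changes the count by at most 1; a sum over values clamped to [0, B] would have sensitivity B instead.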
Differential Privacy and the Risk-Utility Tradeoff for Multi-dimensional Contingency Tables
This paper explores how well the mechanism works in the context of a series of examples, and the extent to which the proposed differential-privacy mechanism allows for sensible inferences from the released data.
Differentially Private Exponential Random Graphs
This work uses the randomized response mechanism to release networks under $\epsilon$-edge differential privacy, and proposes a way to use likelihood based inference and Markov chain Monte Carlo techniques to fit ERGMs to the produced synthetic networks.
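Randomized response on edges, as used above, can be sketched as flipping each potential edge independently. This is an illustrative sketch only; the harder inference step of fitting ERGMs to the resulting noisy network via MCMC is omitted.

```python
import math
import random

def randomized_response_graph(adj, eps, rng):
    """Release a noisy undirected graph under eps-edge differential privacy.

    Each potential edge keeps its true value with probability
    e^eps / (1 + e^eps) and is flipped otherwise, so the likelihood
    ratio for any single edge is bounded by e^eps.
    """
    p_keep = math.exp(eps) / (1 + math.exp(eps))
    n = len(adj)
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j] if rng.random() < p_keep else 1 - adj[i][j]
            noisy[i][j] = noisy[j][i] = bit
    return noisy

rng = random.Random(1)
adj = [[0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 0]]
noisy = randomized_response_graph(adj, eps=20.0, rng=rng)
```

Because the flip probability is known exactly, it can be folded into the likelihood when fitting a model to the released network, which is what makes likelihood-based inference on the synthetic graph possible.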
A Statistical Framework for Differential Privacy
This work studies a general privacy method, called the exponential mechanism, introduced by McSherry and Talwar (2007), and shows that the accuracy of this method is intimately linked to the rate at which the empirical distribution concentrates in a small ball around the true distribution.
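The exponential mechanism selects an output with probability proportional to an exponentially weighted utility score. A minimal sketch over a finite candidate set (parameter names are mine; `sensitivity` is the sensitivity of the utility function):

```python
import math
import random

def exponential_mechanism(candidates, utility, sensitivity, eps, rng):
    """Sample r with probability proportional to exp(eps * u(r) / (2 * sensitivity))."""
    weights = [math.exp(eps * utility(r) / (2 * sensitivity)) for r in candidates]
    total = sum(weights)
    x = rng.random() * total
    acc = 0.0
    for r, w in zip(candidates, weights):
        acc += w
        if x <= acc:
            return r
    return candidates[-1]  # guard against floating-point rounding

rng = random.Random(0)
pick = exponential_mechanism([0, 1, 2], utility=lambda r: r,
                             sensitivity=1.0, eps=50.0, rng=rng)
```

In practice one subtracts the maximum utility before exponentiating to avoid overflow; the small scores here keep the sketch simple.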
Bayesian parameter estimation via variational methods
It is shown that an accurate variational transformation can be used to obtain a closed form approximation to the posterior distribution of the parameters thereby yielding an approximate posterior predictive model.
Information preservation in statistical privacy and bayesian estimation of unattributed histograms
In statistical privacy, utility refers to two concepts: information preservation -- how much statistical information is retained by a sanitizing algorithm, and usability -- how (and with how much
Our Data, Ourselves: Privacy Via Distributed Noise Generation
This work provides efficient distributed protocols for generating shares of random noise, secure against malicious participants, and introduces a technique for distributing shares of many unbiased coins with fewer executions of verifiable secret sharing than would be needed using previous approaches.
Accurate Estimation of the Degree Distribution of Private Networks
An efficient algorithm for releasing a provably private estimate of the degree distribution of a network, showing that the algorithm's variance and bias is low, that the error diminishes as the size of the input graph increases, and that common analyses like fitting a power-law can be carried out very accurately.