Corpus ID: 235376858

Prior-Aware Distribution Estimation for Differential Privacy

Authors: Yuchao Tao, Johes Bater, Ashwin Machanavajjhala
Joint distribution estimation of a dataset under differential privacy is a fundamental problem for many privacy-focused applications, such as query answering, machine learning, and synthetic data generation. In this work, we examine the joint distribution estimation problem given two inputs: 1) differentially private answers to a workload of queries computed over the private data and 2) a prior empirical distribution from a public dataset. Our goal is to find a new distribution such that estimating… 
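The setting described in the abstract — reconciling a public prior with a noisy private measurement — can be illustrated with a minimal sketch. Assuming (hypothetically) that both the prior estimate and the differentially private answer come with known variances, an inverse-variance-weighted average combines them; the function name and interface are illustrative, not the paper's actual method.

```python
def combine(prior_value, prior_var, noisy_value, noise_var):
    """Precision-weighted average of a public prior estimate and a
    differentially private measurement of the same quantity.

    Weights are inverse variances, so the less noisy source dominates.
    """
    w_prior = 1.0 / prior_var
    w_noisy = 1.0 / noise_var
    return (w_prior * prior_value + w_noisy * noisy_value) / (w_prior + w_noisy)
```

With equal variances the result is the midpoint; as the DP noise variance grows, the estimate falls back toward the prior.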

Graphical-model based estimation and inference for differential privacy
This work provides an efficient approach to the distribution-estimation problem using graphical models, which is particularly effective when the distribution is high-dimensional but the measurements cover only low-dimensional marginals.
Differentially Private High-Dimensional Data Publication via Sampling-Based Inference
This paper proposes a novel solution to preserve the joint distribution of a high-dimensional dataset using an integer programming relaxation and the constrained concave-convex procedure and proves that selecting the optimal marginals with the goal of minimizing error is NP-hard.
PrivBayes: private data release via Bayesian networks
PrivBayes is a differentially private method for releasing high-dimensional data that circumvents the curse of dimensionality; it builds its model more accurately by using a novel surrogate function for mutual information.
Optimizing error of high-dimensional statistical queries under differential privacy
HDMM is proposed, a new differentially private algorithm for answering a workload of predicate counting queries that is especially effective for higher-dimensional datasets and can answer queries with lower error than state-of-the-art techniques on a variety of low- and high-dimensional datasets.
Privately Learning Markov Random Fields
It is shown that only structure learning under approximate differential privacy maintains the non-private logarithmic dependence on the dimensionality of the data, while a change in either the learning goal or the privacy notion would necessitate a polynomial dependence.
Monitoring web browsing behavior with differential privacy
This paper adopts differential privacy, a strong, provable privacy definition, and shows that differentially private aggregates of web browsing activities can be released in real-time while preserving the utility of shared data.
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
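The calibration rule summarized above is concrete enough to sketch: adding Laplace noise with scale sensitivity/ε to a function's output yields ε-differential privacy. The sampler below uses the standard inverse-CDF construction for the Laplace distribution; the function name and signature are illustrative.

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    Uses inverse-CDF sampling: for u uniform on (-1/2, 1/2),
    -b * sgn(u) * ln(1 - 2|u|) is Laplace-distributed with scale b.
    """
    b = sensitivity / epsilon
    u = rng.random() - 0.5
    return true_value - b * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
```

For a counting query (sensitivity 1) at ε = 1, the noise has scale 1 and variance 2; halving ε doubles the noise scale.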
The Estimation of Distributions and the Minimum Relative Entropy Principle
The relationship of EDA to algorithms developed in statistics, artificial intelligence, and statistical physics is explained within a general interdisciplinary framework and it is shown that maximum entropy approximations play a crucial role.
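The minimum-relative-entropy principle mentioned above has a classic computational instance: iterative proportional fitting (IPF), which, among all joint tables matching given marginals, converges to the one closest in KL divergence to the starting prior. A hypothetical two-attribute example, not taken from the paper:

```python
def ipf(prior, row_marg, col_marg, iters=50):
    """Iterative proportional fitting on a 2-D joint table.

    Alternately rescales rows and columns of the prior until the table
    matches the target marginals; the fixed point minimizes KL
    divergence to the prior (minimum relative entropy).
    """
    p = [list(row) for row in prior]
    for _ in range(iters):
        for i, target in enumerate(row_marg):
            s = sum(p[i])
            if s > 0:
                p[i] = [x * target / s for x in p[i]]
        for j, target in enumerate(col_marg):
            s = sum(row[j] for row in p)
            if s > 0:
                for row in p:
                    row[j] *= target / s
    return p
```

Starting from a uniform prior, IPF recovers the independent product of the marginals; a non-uniform prior preserves its correlation structure as far as the constraints allow.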
A Simple and Practical Algorithm for Differentially Private Data Release
A new algorithm for differentially private data release, based on a simple combination of the Multiplicative Weights update rule with the Exponential Mechanism, achieves the best known and nearly optimal theoretical guarantees while being simple to implement and experimentally more accurate on real data sets than existing techniques.
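The Multiplicative Weights half of that combination is easy to sketch: maintain a synthetic distribution over the data domain and multiplicatively reweight it toward agreement with each noisy query answer. The exponential-mechanism step that selects which query to measure is omitted here, and the parameter names are illustrative.

```python
import math

def mwem_update(dist, query, noisy_answer, eta=0.5):
    """One Multiplicative Weights step toward a noisy linear query answer.

    dist: probability vector over the data domain.
    query: set of domain indices the counting query covers.
    Items inside the query are up- or down-weighted depending on the
    sign of the error, then the distribution is renormalized.
    """
    cur = sum(dist[i] for i in query)
    err = noisy_answer - cur
    new = [p * math.exp(eta * err * (1.0 if i in query else 0.0))
           for i, p in enumerate(dist)]
    total = sum(new)
    return [p / total for p in new]
```

Iterating the update drives the synthetic distribution's answer toward the measured value, which is the fixed point of the reweighting.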
Practicing Differential Privacy in Health Care: A Review
The current literature on differential privacy is reviewed, and important general limitations of the model and the proposed mechanisms are highlighted, including the theoretical nature of the privacy parameter epsilon.