Differentially Private Empirical Risk Minimization

@article{Chaudhuri2011DifferentiallyPE,
  title={Differentially Private Empirical Risk Minimization},
  author={Kamalika Chaudhuri and Claire Monteleoni and Anand D. Sarwate},
  journal={Journal of Machine Learning Research},
  year={2011},
  volume={12},
  pages={1069--1109}
}
Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the ε-differential privacy definition due to Dwork et al. (2006). First we apply the output perturbation ideas of Dwork et al. (2006) to ERM…
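
The first of these mechanisms, output perturbation, solves the regularized ERM problem and then adds noise calibrated to the sensitivity of its minimizer. Below is a minimal sketch of that idea for L2-regularized logistic regression, assuming each feature vector has L2 norm at most 1 and labels in {-1, +1} (so the minimizer has L2 sensitivity 2/(nλ)); the function name, the use of scikit-learn for the non-private solve, and the sampling details are illustrative choices, not the authors' reference implementation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def output_perturbation(X, y, eps, lam):
        """Sketch of output perturbation (Chaudhuri et al., 2011, Alg. 1).

        Assumes ||x_i||_2 <= 1 and y_i in {-1, +1}, so the minimizer of
        (1/n) * sum_i loss(w; x_i, y_i) + (lam/2) * ||w||^2 has L2
        sensitivity 2 / (n * lam) for the 1-Lipschitz logistic loss.
        """
        n, d = X.shape
        # Non-private ERM solve; sklearn's C equals 1/(n*lam) for this objective.
        clf = LogisticRegression(C=1.0 / (n * lam), fit_intercept=False)
        clf.fit(X, y)
        w = clf.coef_.ravel()

        # Noise density proportional to exp(-eps * ||b|| / sensitivity):
        # norm ~ Gamma(shape=d, scale=sensitivity/eps), direction uniform
        # on the unit sphere.
        sensitivity = 2.0 / (n * lam)
        norm = np.random.gamma(shape=d, scale=sensitivity / eps)
        direction = np.random.normal(size=d)
        direction /= np.linalg.norm(direction)
        return w + norm * direction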

Citations

Private Convex Empirical Risk Minimization and High-dimensional Regression

TLDR
This work significantly extends the analysis of the “objective perturbation” algorithm of Chaudhuri et al. (2011) for convex ERM problems, and gives the best known algorithms for differentially private linear regression.
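
Because several of the citing papers extend it, the paper's second mechanism, objective perturbation, is worth a sketch as well: instead of perturbing the solution, a random linear term is added to the objective before solving. As above, this is a hedged sketch for logistic regression, not the authors' code; in particular, raising an error when the adjusted budget is non-positive is a simplification (the full Algorithm 2 adds extra regularization in that case).

    import numpy as np
    from scipy.optimize import minimize

    def objective_perturbation(X, y, eps, lam, c=0.25):
        """Sketch of objective perturbation (Chaudhuri et al., 2011, Alg. 2).

        Assumes ||x_i||_2 <= 1 and y_i in {-1, +1}; c = 1/4 upper-bounds
        the second derivative of the logistic loss.
        """
        n, d = X.shape
        # Budget adjustment from the privacy analysis of the algorithm.
        eps_adj = eps - 2.0 * np.log(1.0 + c / (n * lam))
        if eps_adj <= 0:
            raise ValueError("privacy budget too small for this n and lam")

        # Noise density proportional to exp(-(eps_adj / 2) * ||b||).
        norm = np.random.gamma(shape=d, scale=2.0 / eps_adj)
        b = np.random.normal(size=d)
        b *= norm / np.linalg.norm(b)

        def perturbed_objective(w):
            margins = y * (X @ w)
            logistic_loss = np.logaddexp(0.0, -margins).mean()
            return logistic_loss + 0.5 * lam * (w @ w) + (b @ w) / n

        return minimize(perturbed_objective, np.zeros(d), method="L-BFGS-B").x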

Practical differential privacy in high dimensions

TLDR
This work proposes a new feature selection mechanism that fits well with the design constraints imposed by differential privacy and allows for improved scalability of private classifiers in realistic settings; it investigates differentially private Naive Bayes and Logistic Regression and shows non-trivial performance on a number of datasets.

Private Convex Optimization for Empirical Risk Minimization with Applications to High-dimensional Regression

TLDR
This work significantly extends the analysis of the “objective perturbation” algorithm of Chaudhuri et al. (2011) for convex ERM problems, and gives the best known algorithms for differentially private linear regression.

Privacy-Preserving Cost-Sensitive Learning

TLDR
This work develops a unified framework for existing cost-sensitive learning methods by incorporating the weight constant and weight functions into the classical regularized empirical risk minimization framework, and proposes two privacy-preserving algorithms with output perturbation and objective perturbation methods to be integrated with the cost-sensitive learning framework.

Projection-free Online Empirical Risk Minimization with Privacy-preserving and Privacy Expiration

  • Jian Lou, Y. Cheung
  • Computer Science
    2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
  • 2020
TLDR
This work proposes projection-free COCO with a differential privacy guarantee, a de facto standard for privacy preservation; it is the first projection-free differentially private COCO and thus broadens the applicability of COCO with a privacy guarantee.

Dynamic Privacy For Distributed Machine Learning Over Network

TLDR
This paper develops two methods to provide differential privacy to distributed learning algorithms over a network using the alternating direction method of multipliers (ADMM), namely dual variable perturbation and primal variable perturbation, to provide dynamic differential privacy.

When Relaxations Go Bad: "Differentially-Private" Machine Learning

TLDR
Current mechanisms for differential privacy for machine learning rarely offer acceptable utility-privacy tradeoffs: settings that incur limited accuracy loss provide little effective privacy, and settings that provide strong privacy result in useless models.

Differentially Private Empirical Risk Minimization with Sparsity-Inducing Norms

TLDR
This is the first work that analyzes the dual optimization problems of risk minimization in the context of differential privacy, for a particular class of convex but non-smooth regularizers that induce structured sparsity, together with loss functions for generalized linear models.

Matrix Gaussian Mechanisms for Differentially-Private Learning

TLDR
This work proposes the Matrix Gaussian Mechanism (MGM), a new (ε, δ)-differential privacy mechanism for preserving learning data privacy, and introduces two mechanisms based on MGM with improved utility.
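
For context, the classical vector-valued (ε, δ) Gaussian mechanism that MGM generalizes to matrix-shaped data can be stated in a few lines; the σ formula below is the standard bound, valid for ε < 1, and the function name is an illustrative choice.

    import numpy as np

    def gaussian_mechanism(value, l2_sensitivity, eps, delta):
        """Classical (eps, delta)-DP Gaussian mechanism, valid for eps < 1:
        sigma = sqrt(2 * ln(1.25 / delta)) * l2_sensitivity / eps."""
        sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / eps
        return value + np.random.normal(0.0, sigma, size=np.shape(value))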

Dynamic Differential Privacy for Distributed Machine Learning over Networks

TLDR
This paper develops two methods to provide differential privacy to distributed learning algorithms over a network using the alternating direction method of multipliers (ADMM), namely dual variable perturbation and primal variable perturbation, to provide dynamic and networked differential privacy.
...

References

Showing 1-10 of 60 references

Privacy-preserving logistic regression

TLDR
This paper addresses the important tradeoff between privacy and learnability when designing algorithms for learning from private databases, providing a privacy-preserving regularized logistic regression algorithm based on a new privacy-preserving technique.

Learning in a Large Function Space: Privacy-Preserving Mechanisms for SVM Learning

TLDR
This paper explores the release of Support Vector Machine (SVM) classifiers while preserving the privacy of training data and presents efficient mechanisms for finite-dimensional feature mappings and for (potentially infinite-dimensional) mappings with translation-invariant kernels.

Bounds on the sample complexity for private learning and private data release

TLDR
This work examines several private learning tasks and gives tight bounds on their sample complexity, showing strong separations between the sample complexities of proper and improper private learners (no such separation exists for non-private learners) and between the sample complexities of efficient and inefficient proper private learners.

Limiting privacy breaches in privacy preserving data mining

TLDR
This paper presents a new formulation of privacy breaches, together with a methodology, "amplification", for limiting them, and instantiate this methodology for the problem of mining association rules, and modify the algorithm from [9] to limit privacy breaches without knowledge of the data distribution.

Calibrating Noise to Sensitivity in Private Data Analysis

TLDR
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
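
The mechanism this reference introduces, now known as the Laplace mechanism, underlies the ε-differential privacy definition used throughout the main paper; a minimal sketch (the function name is an illustrative choice):

    import numpy as np

    def laplace_mechanism(value, l1_sensitivity, eps):
        """Laplace mechanism (Dwork et al., 2006): Laplace noise with
        scale l1_sensitivity / eps yields eps-differential privacy."""
        return value + np.random.laplace(0.0, l1_sensitivity / eps, np.shape(value))

    # Example: a counting query changes by at most 1 per individual,
    # so l1_sensitivity = 1.
    # noisy_count = laplace_mechanism(true_count, 1.0, eps=0.1)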

A Statistical Framework for Differential Privacy

TLDR
This work studies a general privacy method, called the exponential mechanism, introduced by McSherry and Talwar (2007), and shows that the accuracy of this method is intimately linked to the rate at which the empirical distribution concentrates in a small ball around the true distribution.

Differential privacy with compression

TLDR
It is shown that, despite the general difficulty of achieving the differential privacy guarantee, it is possible to publish synthetic data that are useful for a number of common statistical learning applications based on the covariance of the initial data.

L-diversity: privacy beyond k-anonymity

TLDR
This paper shows with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems, and proposes a novel and powerful privacy definition called ℓ-diversity, which is practical and can be implemented efficiently.

What Can We Learn Privately?

TLDR
This work investigates learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in the contexts where aggregate information is released about a database containing sensitive information about individuals.

Composition attacks and auxiliary information in data privacy

TLDR
This paper investigates composition attacks, in which an adversary uses independent anonymized releases to breach privacy, and provides a precise formulation of this property, and proves that an important class of relaxations of differential privacy also satisfy the property.
...