Easy Differentially Private Linear Regression
Kareem Amin, Matthew Joseph, Mónica Ribero, Sergei Vassilvitskii

Linear regression is a fundamental tool for statistical analysis. This has motivated the development of linear regression methods that also satisfy differential privacy and thus guarantee that the learned model reveals little about any one data point used to construct it. However, existing differentially private solutions assume that the end user can easily specify good data bounds and hyperparameters; both present significant practical obstacles. In this paper, we study an algorithm which uses…

Differentially Private Regression with Unbounded Covariates

Through the case of binary regression, this work captures the fundamental and widely studied models of logistic regression and linearly separable SVMs, learning an unbiased estimate of the true regression vector up to a scaling factor.

Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain

This work revisits the problem of linear regression under a differential privacy constraint and proposes simple modifications of two existing DP algorithms that can be upgraded into adaptive algorithms that are able to exploit data-dependent quantities and behave nearly optimally for every instance.

Private Convex Empirical Risk Minimization and High-dimensional Regression

This work significantly extends the analysis of the “objective perturbation” algorithm of Chaudhuri et al. (2011) for convex ERM problems, and gives the best known algorithms for differentially private linear regression.

Differentially Private Simple Linear Regression

A thorough experimental evaluation of differentially private algorithms for simple linear regression on small datasets with tens to hundreds of records is performed, finding that algorithms based on robust estimators—in particular, the median-based estimator of Theil and Sen—perform best on small datasets, while algorithms based on Ordinary Least Squares or Gradient Descent perform better on large datasets.
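To give a flavor of the median-based approach, here is a minimal sketch (not the paper's exact algorithm) of a differentially private Theil–Sen-style slope estimate: points are paired at random so each record influences at most one pairwise slope, and an approximate median slope is then selected with the exponential mechanism over a discretized candidate range. The function names, grid discretization, and slope bound are illustrative assumptions.

```python
import numpy as np

def dp_median_exp(values, lo, hi, eps, grid_size=1000, rng=None):
    """Approximate DP median via the exponential mechanism.

    Candidates form a uniform grid on [lo, hi]; the utility of a candidate r
    is -|#{v < r} - n/2|, which has sensitivity 1 when one value changes.
    """
    rng = rng or np.random.default_rng()
    values = np.clip(values, lo, hi)
    n = len(values)
    candidates = np.linspace(lo, hi, grid_size)
    # rank of each candidate among the sorted values
    ranks = np.searchsorted(np.sort(values), candidates)
    utility = -np.abs(ranks - n / 2)
    # exponential mechanism: P(r) proportional to exp(eps * u(r) / 2)
    logits = eps * utility / 2
    logits -= logits.max()  # numerical stability before exponentiating
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(candidates, p=probs)

def dp_theil_sen_slope(x, y, slope_bound, eps, rng=None):
    """DP Theil-Sen-style slope: disjoint random pairs, then a DP median.

    Pairing points disjointly keeps each record's influence on the slope
    multiset to a single entry, so the median's rank utility retains
    sensitivity 1.
    """
    rng = rng or np.random.default_rng()
    n = len(x)
    perm = rng.permutation(n)
    slopes = []
    for k in range(0, n - 1, 2):
        i, j = perm[k], perm[k + 1]
        if x[j] != x[i]:
            slopes.append((y[j] - y[i]) / (x[j] - x[i]))
    return dp_median_exp(np.array(slopes), -slope_bound, slope_bound, eps, rng=rng)
```

The disjoint pairing trades statistical efficiency (n/2 slopes instead of all n(n-1)/2 pairs) for a simple sensitivity argument; the DP Theil–Sen variants studied in the paper make this trade-off more carefully.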

Differentially Private Estimation via Statistical Depth

Standard notions of statistical depth, i.e., halfspace depth and regression depth, are shown to be particularly advantageous in this regard, both in the sense that the maximum influence of a single observation is easy to analyze and that this value is typically low.

Private selection from private candidates

This work considers the selection problem under a much weaker stability assumption on the candidates, namely that the score functions are differentially private, and presents algorithms that are near-optimal along the three relevant dimensions: privacy, utility and computational efficiency.

Differential privacy and robust statistics in high dimensions

A universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees, which builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism, and which provides tight local sensitivity bounds.

Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds

This work provides new algorithms and matching lower bounds for differentially private convex empirical risk minimization assuming only that each data point's contribution to the loss function is Lipschitz and that the domain of optimization is bounded.

Hyperparameter Tuning with Renyi Differential Privacy

The analysis supports the previous observation that tuning hyperparameters does indeed leak private information, but proves that, under certain assumptions, this leakage is modest as long as each candidate training run needed to select hyperparameters is itself differentially private.

Robust and Differentially Private Mean Estimation

This work introduces PRIME, the first efficient algorithm that achieves both privacy and robustness for a wide range of distributions, and complements this result with a novel exponential-time algorithm that improves the sample complexity of PRIME.