• Corpus ID: 232014629

Learning with User-Level Privacy

@inproceedings{Lvy2021LearningWU,
  title={Learning with User-Level Privacy},
  author={Daniel L{\'e}vy and Ziteng Sun and Kareem Amin and Satyen Kale and Alex Kulesza and Mehryar Mohri and Ananda Theertha Suresh},
  booktitle={NeurIPS},
  year={2021}
}
We propose and analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution ($m \ge 1$ samples), providing more stringent but more realistic protection against information leaks. We show that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning hypothesis… 
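For intuition, the user-level mean estimation setting described in the abstract can be sketched as: average each user's m samples, clip the per-user means, and add Gaussian noise calibrated to the per-user sensitivity. This is a minimal illustrative sketch, not the paper's algorithm; the function name and the norm-clipping scheme are assumptions for illustration.

```python
import numpy as np

def user_level_dp_mean(user_samples, clip_norm, epsilon, delta, rng=None):
    """Estimate a d-dimensional mean under user-level (epsilon, delta)-DP.

    Each user contributes an (m, d) array of samples. We average within
    each user, clip the per-user mean to L2 norm `clip_norm`, average
    across users, and add Gaussian noise scaled to the sensitivity of
    that average (replacing one user's entire data moves the clipped
    average by at most 2 * clip_norm / n in L2 norm).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(user_samples)
    per_user = np.stack([np.mean(s, axis=0) for s in user_samples])
    norms = np.linalg.norm(per_user, axis=1, keepdims=True)
    clipped = per_user * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    sensitivity = 2.0 * clip_norm / n
    # Standard Gaussian-mechanism calibration (valid for epsilon <= 1).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=clipped.shape[1])
```

Note that averaging within each user before clipping is what lets the noise shrink with the number of users n rather than the total sample count.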
Tight and Robust Private Mean Estimation with Few Users
TLDR
This work designs an (ε, δ)-differentially private mechanism for high-dimensional mean estimation under user-level differential privacy that uses as few users as possible, and provides a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation.
Private Federated Learning Without a Trusted Server: Optimal Algorithms for Convex Losses
TLDR
This paper provides tight upper and lower bounds for LDP convex/strongly convex federated stochastic optimization with homogeneous (i.i.d.) client data, and shows that similar rates are attainable for smooth losses with arbitrary heterogeneous client data distributions, via a linear-time accelerated LDP algorithm.
Instance-optimal Mean Estimation Under Differential Privacy
TLDR
A mechanism that is instance-optimal in a strong sense, adapting to a variety of data characteristics without the need for parameter tuning, and that easily extends to the local and shuffle models as well.
A Private and Computationally-Efficient Estimator for Unbounded Gaussians
TLDR
The primary new technical tool in the algorithm is a new differentially private preconditioner that takes samples from an arbitrary Gaussian N(0, Σ) and returns a matrix A such that AΣAᵀ has constant condition number.
Differential Privacy for Heterogeneous Federated Learning
TLDR
This work tackles heterogeneity issues under differential privacy (DP) constraints in a federated learning framework, establishing both privacy and utility guarantees that show the superiority of DP-SCAFFOLD over the naive algorithm DP-FedAvg.
On Privacy and Personalization in Cross-Silo Federated Learning
TLDR
This work shows that mean-regularized multi-task learning (MR-MTL), a simple personalization framework, is a strong baseline for cross-silo FL: under stronger privacy, silos are further incentivized to “federate” with each other to mitigate DP noise, resulting in consistent improvements relative to standard baseline methods.
Histogram Estimation under User-level Privacy with Heterogeneous Data
TLDR
This work proposes an algorithm based on a clipping strategy that almost achieves a two-approximation with respect to the best clipping threshold in hindsight and proves that the clipping bias can be significantly reduced when the counts are from non-i.i.d. Poisson distributions.
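The clipping strategy for user-level histogram estimation can be illustrated with a short sketch: clip each user's count vector to a total-contribution threshold τ, aggregate, and add Laplace noise scaled to 2τ/ε. This is a generic sketch of the clipping idea, not the paper's two-approximation algorithm; the function name and the L1-clipping choice are assumptions.

```python
import numpy as np

def clipped_user_histogram(user_data, k, tau, epsilon, rng=None):
    """User-level epsilon-DP histogram over symbols {0, ..., k-1}.

    Each user's count vector is rescaled so its total contribution is
    at most `tau`; replacing one user then changes the aggregate by at
    most 2*tau in L1, so Laplace noise with scale 2*tau/epsilon suffices.
    """
    rng = np.random.default_rng() if rng is None else rng
    hist = np.zeros(k)
    for samples in user_data:
        counts = np.bincount(samples, minlength=k).astype(float)
        total = counts.sum()
        if total > tau:
            # Clipping is what introduces the bias the paper analyzes:
            # heavy contributors are scaled down toward the threshold.
            counts *= tau / total
        hist += counts
    return hist + rng.laplace(0.0, 2.0 * tau / epsilon, size=k)
```

The threshold τ trades bias (clipping heavy users) against variance (noise scale grows with τ), which is exactly the tension the hindsight-optimal threshold addresses.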
New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma
TLDR
New lower bounds for statistical estimation tasks under the constraint of (ε, δ)-differential privacy are proved, and a tight Ω(d/(α²ε)) lower bound is shown for estimating the mean of a distribution with bounded covariance to α-error in ℓ₂-distance.
Network change point localisation under local differential privacy
TLDR
This paper investigates the fundamental limits of consistently localising change points under both node and edge privacy constraints, demonstrating an interesting phase transition in terms of the signal-to-noise ratio condition, accompanied by polynomial-time algorithms.
Private Non-Convex Federated Learning Without a Trusted Server
TLDR
Novel algorithms that satisfy local differential privacy at the client level and shuffle differential privacy (SDP) for three classes of Lipschitz continuous loss functions are proposed, including the first DP algorithms for non-convex/non-smooth loss functions.
...
...

References

SHOWING 1-10 OF 73 REFERENCES
Learning discrete distributions: user vs item-level privacy
TLDR
This work studies the fundamental problem of learning discrete distributions over $k$ symbols with user-level differential privacy and proposes a mechanism such that the number of users scales as $\tilde{\mathcal{O}}(k/(m\alpha^2) + k/(\sqrt{m}\epsilon\alpha))$ and shows that it is nearly optimal under certain regimes.
The Cost of Privacy: Optimal Rates of Convergence for Parameter Estimation with Differential Privacy
TLDR
This paper investigates the tradeoff between statistical accuracy and privacy in mean estimation and linear regression, under both the classical low-dimensional and modern high-dimensional settings, and forms a general lower bound argument for minimax risks with differential privacy constraints.
User-Level Private Learning via Correlated Sampling
TLDR
This work shows that, as long as each user receives sufficiently many samples, any privately learnable class can be learned via an (ε, δ)-DP algorithm using only O(log(1/δ)/ε) users, and shows a nearly matching lower bound on the number of users required.
Smoothly Bounding User Contributions in Differential Privacy
TLDR
This work proposes a method which smoothly bounds user contributions by setting appropriate weights on data points and applies it to estimating the mean/quantiles, linear regression, and empirical risk minimization and shows that the algorithm provably outperforms the sample limiting algorithm.
Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds
TLDR
This work provides new algorithms and matching lower bounds for differentially private convex empirical risk minimization assuming only that each data point's contribution to the loss function is Lipschitz and that the domain of optimization is bounded.
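A minimal noisy gradient descent sketch shows the role the Lipschitz assumption plays in private ERM: it bounds each example's gradient, hence the sensitivity of the averaged gradient. This is a generic sketch with naive budget splitting across iterations (tighter composition and amplification analyses exist but are omitted), not the paper's algorithm; all names and parameters are assumptions.

```python
import numpy as np

def noisy_gd(grad, theta0, n, lr, epsilon, delta, L, steps, rng=None):
    """Noisy gradient descent sketch for DP empirical risk minimization.

    `grad(theta)` returns the gradient of the empirical risk averaged
    over n examples, where each per-example loss is L-Lipschitz, so one
    example changes the averaged gradient by at most 2L/n in L2 norm.
    The (epsilon, delta) budget is split evenly across `steps` updates.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=float)
    eps_step = epsilon / steps
    sigma = (2.0 * L / n) * np.sqrt(2.0 * np.log(1.25 * steps / delta)) / eps_step
    for _ in range(steps):
        g = np.asarray(grad(theta))
        theta -= lr * (g + rng.normal(0.0, sigma, size=theta.shape))
    return theta
```

The 2L/n sensitivity is exactly where "each data point's contribution to the loss function is Lipschitz" enters; bounding the optimization domain is what the matching lower bounds additionally require.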
Private Convex Empirical Risk Minimization and High-dimensional Regression
TLDR
This work significantly extends the analysis of the “objective perturbation” algorithm of Chaudhuri et al. (2011) for convex ERM problems, and gives the best known algorithms for differentially private linear regression.
Private Mean Estimation of Heavy-Tailed Distributions
TLDR
Algorithms for the multivariate setting whose sample complexity is a factor of $O(d)$ larger than the univariate case are given, for which the sample complexity is identical for all $k \geq 2$.
Private stochastic convex optimization: optimal rates in linear time
TLDR
Two new techniques for deriving DP convex optimization algorithms are described, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
Bounding User Contributions: A Bias-Variance Trade-off in Differential Privacy
TLDR
It is shown that in general there is a “sweet spot” that depends on measurable properties of the dataset, but that there is also a concrete cost to privacy that cannot be avoided simply by collecting more data.
Private Stochastic Convex Optimization with Optimal Rates
TLDR
The approach builds on existing differentially private algorithms and relies on the analysis of algorithmic stability to ensure generalization and implies that, contrary to intuition based on private ERM, private SCO has asymptotically the same rate of $1/\sqrt{n}$ as non-private SCO in the parameter regime most common in practice.
...
...