Private Ad Modeling with DP-SGD

@article{Denison2022PrivateAM,
  title={Private Ad Modeling with DP-SGD},
  author={Carson E. Denison and Badih Ghazi and Pritish Kamath and Ravi Kumar and Pasin Manurangsi and Krishnagiri Narra and Amer Sinha and Avinash V. Varadarajan and Chiyuan Zhang},
  journal={ArXiv},
  year={2022},
  volume={abs/2211.11896}
}
A well-known algorithm in privacy-preserving ML is differentially private stochastic gradient descent (DP-SGD). While this algorithm has been evaluated on text and image data, it has not been previously applied to ads data, which are notorious for their high class imbalance and sparse gradient updates. In this work we apply DP-SGD to several ad modeling tasks, including predicting click-through rates, conversion rates, and the number of conversion events, and evaluate their privacy-utility trade-offs.
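
A minimal sketch of a single DP-SGD step is given below, assuming a toy logistic-regression click model in JAX; the clip norm, noise multiplier, learning rate, and synthetic imbalanced labels are illustrative assumptions, not the paper's production setup.

```python
# Minimal DP-SGD sketch: per-example gradient clipping plus Gaussian noise,
# on a toy logistic-regression click model with synthetic imbalanced labels.
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    # Binary cross-entropy with logits for a single example.
    logit = jnp.dot(w, x)
    return jnp.logaddexp(0.0, logit) - y * logit

# Per-example gradients via vmap over the batch dimension.
per_example_grad = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))

def dp_sgd_step(w, xs, ys, key, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    grads = per_example_grad(w, xs, ys)                             # (batch, dim)
    norms = jnp.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * jnp.minimum(1.0, clip_norm / (norms + 1e-12))   # clip each example
    noise = noise_mult * clip_norm * jax.random.normal(key, w.shape)
    return w - lr * (grads.sum(axis=0) + noise) / xs.shape[0]

key = jax.random.PRNGKey(0)
xs = jax.random.normal(key, (256, 16))
ys = (jax.random.uniform(jax.random.PRNGKey(1), (256,)) < 0.05).astype(jnp.float32)  # ~5% positives
w = jnp.zeros(16)
for _ in range(10):
    key, sub = jax.random.split(key)
    w = dp_sgd_step(w, xs, ys, sub)
```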


References


AdaCliP: Adaptive Clipping for Private SGD

Proposes AdaCliP, a theoretically motivated differentially private SGD algorithm that provably adds less noise than previous methods by using coordinate-wise adaptive clipping of the gradient.
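
For illustration only, a loose NumPy sketch of the coordinate-wise idea (not the paper's exact estimators or update rules) might look like this:

```python
# Loose sketch of coordinate-wise adaptive clipping: keep a running
# per-coordinate scale, whiten per-example gradients with it, clip in the
# whitened space, add noise, then map back. Constants are illustrative.
import numpy as np

def adaclip_step(per_example_grads, scale, rng, clip_norm=1.0,
                 noise_mult=1.0, ema=0.9):
    whitened = per_example_grads / scale                         # coordinate-wise
    norms = np.linalg.norm(whitened, axis=1, keepdims=True)
    whitened *= np.minimum(1.0, clip_norm / (norms + 1e-12))     # clip each example
    noise = rng.normal(0.0, noise_mult * clip_norm, size=scale.shape)
    noisy_mean = (whitened.sum(axis=0) + noise) / len(per_example_grads)
    grad_estimate = noisy_mean * scale                           # undo whitening
    # Update the per-coordinate scale from the noisy gradient magnitudes.
    new_scale = ema * scale + (1.0 - ema) * (np.abs(grad_estimate) + 1e-3)
    return grad_estimate, new_scale
```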

Unlocking High-Accuracy Differentially Private Image Classification through Scale

Demonstrates that DP-SGD on over-parameterized models can perform significantly better than previously thought, a step towards closing the accuracy gap between private and non-private image classification benchmarks.

Deep Learning with Differential Privacy

This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.

Toward Training at ImageNet Scale with Differential Privacy

Shares initial lessons from an effort to investigate differentially private training at scale, including approaches that make DP training faster, as well as model types and training-process settings that tend to work better in the DP setting.

Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping

Derives new methods for per-example gradient clipping that are compatible with auto-differentiation and provide better GPU utilization by analyzing the back-propagation equations, with privacy analyzed via Rényi Differential Privacy.
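
The flavor of such methods can be sketched for a single dense layer, where the per-example gradient is an outer product and its norm factorizes; this is a simplification of the general approach, with hypothetical array layouts:

```python
# Sketch of the core observation for a dense layer y = x @ W: the
# per-example gradient of W is outer(x_i, err_i), so its Frobenius norm is
# ||x_i|| * ||err_i||, and the clipped gradient sum can be formed as one
# matrix product without materializing any per-example gradient.
import numpy as np

def clipped_grad_sum_dense(xs, errs, clip_norm=1.0):
    # xs: (batch, d_in) layer inputs; errs: (batch, d_out) backpropagated errors.
    norms = np.linalg.norm(xs, axis=1) * np.linalg.norm(errs, axis=1)
    weights = np.minimum(1.0, clip_norm / (norms + 1e-12))   # per-example clip factors
    # Weighted sum of outer products, formed as a single matrix product.
    return (xs * weights[:, None]).T @ errs                  # shape (d_in, d_out)
```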

Training Text-to-Text Transformers with Privacy Guarantees

By using recent advances in JAX and XLA, this work trains models with DP that suffer neither a large drop in pre-training utility nor in training speed, and that can still be fine-tuned to high accuracy on downstream tasks (e.g., GLUE).

Fast and Memory Efficient Differentially Private-SGD via JL Projections

This paper proposes an algorithmic solution that works for any network in a black-box manner and, when training a recurrent neural network, achieves a good privacy-vs-accuracy trade-off while being significantly faster than DP-SGD, with a memory footprint similar to non-private SGD.
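
A hedged sketch of the underlying Johnson-Lindenstrauss idea, materializing the per-example gradient matrix for clarity (the paper avoids this by computing each random projection with Jacobian-vector products, which is where the memory savings come from):

```python
# JL-based per-example gradient-norm estimation: project each per-example
# gradient onto k random directions; the scaled projection norm is an
# unbiased estimate of the squared gradient norm. k is an illustrative choice.
import numpy as np

def jl_norm_estimates(G, rng, k=32):
    # G: (batch, d) per-example gradients.
    P = rng.normal(size=(G.shape[1], k)) / np.sqrt(k)   # random projection
    return np.linalg.norm(G @ P, axis=1)                # approximate per-example norms
```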

Computing Tight Differential Privacy Guarantees Using FFT

Presents a numerical accountant for the subsampled multidimensional Gaussian mechanism that underlies the popular DP stochastic gradient descent, giving exact $(\varepsilon,\delta)$-values.
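
A toy accountant in this spirit, restricted to the plain (unsubsampled) Gaussian mechanism and omitting the paper's discretization-error bounds, might discretize the privacy loss distribution and compose it via FFT:

```python
# Toy FFT-based (eps, delta) accountant: discretize the privacy loss
# distribution (PLD) of the Gaussian mechanism, self-compose it n times by
# FFT convolution, then read off delta(eps). Grid size and range are
# illustrative; subsampling and error tracking are omitted.
import numpy as np

def gaussian_pld(sigma, grid):
    # Privacy loss of N(1, sigma^2) vs N(0, sigma^2) is Gaussian:
    # L ~ N(1/(2 sigma^2), 1/sigma^2), discretized on `grid`.
    mu, var = 1.0 / (2 * sigma**2), 1.0 / sigma**2
    p = np.exp(-(grid - mu) ** 2 / (2 * var))
    return p / p.sum()

def delta_after_composition(sigma, n_steps, eps, half_width=20.0, n_grid=4001):
    grid = np.linspace(-half_width, half_width, n_grid)
    dx = grid[1] - grid[0]
    pld = gaussian_pld(sigma, grid)
    # n-fold self-composition = n-fold convolution, done via zero-padded FFT.
    out_len = n_steps * (n_grid - 1) + 1
    composed = np.fft.irfft(np.fft.rfft(pld, out_len) ** n_steps, out_len)
    composed = np.clip(composed, 0.0, None)
    losses = n_steps * grid[0] + dx * np.arange(out_len)
    mask = losses > eps
    # delta(eps) = E[(1 - exp(eps - L))_+] under the composed PLD.
    return float(np.sum(composed[mask] * (1.0 - np.exp(eps - losses[mask]))))

# e.g. delta_after_composition(sigma=10.0, n_steps=100, eps=1.0)
```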

Tight on Budget?: Tight Bounds for r-Fold Approximate Differential Privacy

Presents privacy buckets, a numerical and widely applicable method for capturing the privacy loss of differentially private mechanisms under composition, and shows that tighter bounds can be derived for concrete sequences of mechanisms by taking the mechanisms' structure into account.

Differentially Private Learning with Adaptive Clipping

Shows that adaptively setting the clipping norm applied to each user's update, based on a differentially private estimate of a target quantile of the distribution of unclipped update norms, removes the need for extensive clipping-norm tuning.
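
A minimal sketch of the quantile-targeting idea, with illustrative learning rate, noise scale, and target quantile (not the paper's exact constants):

```python
# Quantile-targeting adaptive clipping: privately count how many updates
# fall under the current clip norm, then move the norm geometrically
# toward a target quantile of the unclipped-norm distribution.
import numpy as np

def update_clip_norm(update_norms, clip_norm, rng, target_quantile=0.5,
                     lr=0.2, count_noise_std=1.0):
    n = len(update_norms)
    unclipped = np.sum(update_norms <= clip_norm)               # how many fit under the norm
    noisy_frac = (unclipped + rng.normal(0.0, count_noise_std)) / n
    return clip_norm * np.exp(-lr * (noisy_frac - target_quantile))

# e.g. clip_norm = update_clip_norm(norms, clip_norm, np.random.default_rng(0))
```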