Additive Logistic Mechanism for Privacy-Preserving Self-Supervised Learning

Yunhao Yang, Parham Gohari, Ufuk Topcu
We study the privacy risks associated with training a neural network’s weights with self-supervised learning algorithms. Through empirical evidence, we show that the fine-tuning stage, in which the network weights are updated with an informative and often private dataset, is vulnerable to privacy attacks. To address these vulnerabilities, we design a post-training privacy-protection algorithm that adds noise to the fine-tuned weights, and we propose a novel differential privacy mechanism that…
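The truncated abstract describes post-training output perturbation: noise drawn from a logistic distribution is added to the fine-tuned weights. A minimal sketch of that idea follows; the function name `perturb_weights` and the fixed `scale` are illustrative assumptions, since the paper's actual calibration of the noise scale to a privacy budget is not reproduced here.

```python
import numpy as np

def perturb_weights(weights, scale, rng=None):
    """Illustrative sketch: add i.i.d. logistic noise to fine-tuned
    weights after training. How `scale` must be calibrated to achieve
    a given differential-privacy guarantee is left to the paper."""
    rng = rng or np.random.default_rng()
    noise = rng.logistic(loc=0.0, scale=scale, size=weights.shape)
    return weights + noise

# Usage: perturb a small weight vector before releasing the model.
w = np.array([0.5, -1.2, 3.0])
w_private = perturb_weights(w, scale=0.1)
```

Because the noise is added once, post-training, the mechanism does not touch the training loop itself, in contrast to gradient-perturbation approaches such as DP-SGD.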

References
The Algorithmic Foundations of Differential Privacy
The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy and to the application of these techniques in creative combinations, using the query-release problem as a running example.
A Simple Framework for Contrastive Learning of Visual Representations
It is shown that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps than supervised learning.
Membership Inference Attacks Against Machine Learning Models
This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.
Deep Learning with Differential Privacy
This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
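The core step of the DP-SGD approach summarized above is to clip each per-example gradient to a fixed norm bound and add Gaussian noise scaled to that bound before averaging. A minimal sketch, assuming plain NumPy arrays as gradients (the helper name `dp_sgd_step` is hypothetical, not from the paper):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """One noisy aggregation step in the style of DP-SGD:
    clip each per-example gradient to L2 norm `clip_norm`, sum,
    add Gaussian noise with std `noise_multiplier * clip_norm`,
    then average over the batch."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down (never up) so each example's influence is bounded.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)
```

The clipping bound is what gives the summed gradient a known sensitivity, which in turn lets the Gaussian noise be calibrated; the overall privacy cost is then tracked across iterations by a privacy accountant, which this sketch omits.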
Our Data, Ourselves: Privacy Via Distributed Noise Generation
This work provides efficient distributed protocols for generating shares of random noise, secure against malicious participants, and introduces a technique for distributing shares of many unbiased coins with fewer executions of verifiable secret sharing than would be needed using previous approaches.
Calibrating Noise to Sensitivity in Private Data Analysis
The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
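The calibration idea summarized above is the classic Laplace mechanism: release f(D) plus Laplace noise whose scale is the L1-sensitivity of f divided by the privacy parameter epsilon. A short sketch (the helper name `laplace_mechanism` is illustrative):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value + Lap(sensitivity / epsilon).
    Calibrating the noise scale to the sensitivity of f yields
    epsilon-differential privacy for the released value."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Usage: a counting query has sensitivity 1, since adding or
# removing one record changes the count by at most 1.
noisy_count = laplace_mechanism(42, sensitivity=1.0, epsilon=0.5)
```

Smaller epsilon (stronger privacy) or larger sensitivity both increase the noise scale, which is the privacy-utility trade-off that the later, learning-focused papers in this list inherit.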
Learning Multiple Layers of Features from Tiny Images
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
A Differentially Private Framework for Deep Learning With Convexified Loss Functions
A novel output perturbation framework is proposed that injects DP noise into a randomly sampled neuron at the output layer of a baseline non-private neural network trained with a convexified loss function, achieving a better privacy-utility trade-off than existing DP-SGD implementations.
10 Security and Privacy Problems in Self-Supervised Learning
This book chapter discusses 10 basic security and privacy problems for the pre-trained encoders in self-supervised learning, including six confidentiality problems, three integrity problems, and one availability problem.
On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models
It is demonstrated that RNNs are subject to higher attack accuracy than their feed-forward neural network (FFNN) counterparts and can be less amenable to mitigation methods, leading to the conclusion that the privacy risks of recurrent architectures are higher than those of feed-forward ones.