• Corpus ID: 221586480

Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning

Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang
Although machine learning (ML) models are deployed as core components in many critical applications, they are vulnerable to various security and privacy attacks. One major privacy attack in this domain is membership inference, where an adversary aims to determine whether a target data sample is part of the training set of a target ML model. So far, most membership inference attacks have been evaluated against ML models trained from scratch. However, real-world ML models are typically trained…
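To make the membership inference setting concrete, here is a minimal sketch of a generic confidence-thresholding baseline. It is not the attack proposed in this paper. It assumes the adversary can observe the target model's output scores, and it guesses "member" when the top-class confidence is high. The logits, the threshold of 0.9, and all function names are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def infer_membership(posterior, threshold=0.9):
    """Guess 'member' when the top-class confidence reaches the threshold."""
    return max(posterior) >= threshold

def attack_accuracy(member_posteriors, nonmember_posteriors, threshold=0.9):
    """Fraction of samples whose membership status is guessed correctly."""
    correct = sum(infer_membership(p, threshold) for p in member_posteriors)
    correct += sum(not infer_membership(p, threshold) for p in nonmember_posteriors)
    return correct / (len(member_posteriors) + len(nonmember_posteriors))

# Hypothetical target-model outputs (illustrative values, not real results).
member_logits = [[6.0, 0.5, 0.1], [0.2, 5.5, 0.3], [4.0, 3.5, 0.0]]
nonmember_logits = [[1.2, 1.0, 0.8], [2.0, 1.5, 0.1], [0.1, 0.3, 5.0]]

members = [softmax(l) for l in member_logits]
nonmembers = [softmax(l) for l in nonmember_logits]
accuracy = attack_accuracy(members, nonmembers)
print(f"attack accuracy on toy data: {accuracy:.2f}")
```

On this toy data the rule misclassifies one low-confidence member and one high-confidence non-member. Both errors show why a single confidence threshold is only a rough baseline.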


FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries

This extensive study indicates that information leakage is a real privacy threat to the transfer learning framework widely used in real-life situations.

TransMIA: Membership Inference Attacks Using Transfer Shadow Training

A transfer shadow training technique is proposed, where an adversary employs the parameters of the transferred model to construct shadow models, to significantly improve the performance of membership inference when a limited amount of shadow training data is available to the adversary.

How Does a Deep Learning Model Architecture Impact Its Privacy?

This paper investigates several representative model architectures, from CNNs to Transformers, shows that Transformers are generally more vulnerable to privacy attacks than CNNs, and demonstrates that the micro design of activation layers, stem layers, and bias parameters is the major reason why CNNs are more resilient to privacy attacks.

Enhanced Membership Inference Attacks against Machine Learning Models

This paper presents a comprehensive hypothesis testing framework that makes it possible not only to formally express prior work in a consistent way, but also to design new membership inference attacks that use reference models to achieve significantly higher power for any error (false positive rate).

Membership Inference Attacks on Machine Learning: A Survey

This article provides taxonomies for both attacks and defenses based on their characterizations, discusses their pros and cons, and points out several promising future research directions to inspire researchers who wish to follow this area.

Accuracy-Privacy Trade-off in Deep Ensembles

This paper empirically demonstrates the trade-off between privacy and accuracy in deep ensemble learning and finds that ensembling can improve either privacy or accuracy, but not both simultaneously: when ensembling improves the classification accuracy, the effectiveness of the MI attack also increases.

Model Inversion Attack against Transfer Learning: Inverting a Model without Accessing It

Experiments show that highly recognizable data records can be recovered with both black-box attacks, which suit different situations and do not rely on queries to the target student model, meaning that even if a model is an inaccessible black box, it can still be inverted.

Teacher Model Fingerprinting Attacks Against Transfer Learning

This paper proposes a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from, and shows that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.

SoK: On the Impossible Security of Very Large Foundation Models

Lower bounds on accuracy in privacy-preserving and Byzantine-resilient heterogeneous learning that, it is argued, constitute a compelling case against the possibility of designing a secure and privacy-preserving high-accuracy foundation model.

Membership Inference Attacks and Defenses in Neural Network Pruning

This paper investigates the impact of neural network pruning on training data privacy, proposes a self-attention membership inference attack against pruned neural networks, and proposes a new defense mechanism that protects the pruning process by mitigating the prediction divergence, measured by KL-divergence distance.



Label-Leaks: Membership Inference Attack with Label

A systematic investigation of membership inference attacks when the target model only provides the predicted label, which focuses on two adversarial settings and proposes two different attacks, namely a transfer-based attack and a perturbation-based attack.

Machine Learning with Membership Privacy using Adversarial Regularization

It is shown that the min-max strategy can mitigate the risks of membership inference attacks (near random guess), and can achieve this with a negligible drop in the model's prediction accuracy (less than 4%).

ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models

This is the most comprehensive study so far of this emerging and developing threat, using eight diverse datasets to show the viability of the proposed attacks across domains, and it proposes the first effective defense mechanisms against this broader class of membership inference attacks that maintain a high level of utility of the ML model.

Membership Inference Attacks Against Machine Learning Models

This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.

Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

The reasons why deep learning models may leak information about their training data are investigated and new algorithms tailored to the white-box setting are designed by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks.

Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting

The effect that overfitting and influence have on the ability of an attacker to learn information about the training data from machine learning models, either through training set membership inference or attribute inference attacks is examined.

Exploiting Unintended Feature Leakage in Collaborative Learning

This work shows that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data and develops passive and active inference attacks to exploit this leakage.

MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples

This work proposes MemGuard, the first defense with formal utility-loss guarantees against black-box membership inference attacks, and is the first to show that adversarial examples can be used as defensive mechanisms against membership inference attacks.

GAN-Leaks: A Taxonomy of Membership Inference Attacks against GANs

This paper presents the first taxonomy of membership inference attacks against GANs, which encompasses not only existing attacks but also novel ones, and proposes the first generic attack model that can be instantiated in various settings according to the adversary's knowledge about the victim model.

Stealing Machine Learning Models via Prediction APIs

Simple, efficient attacks are shown that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees against the online services of BigML and Amazon Machine Learning.