Corpus ID: 67876983

Exploring Connections Between Active Learning and Model Extraction

  @inproceedings{chandrasekaran_exploring_connections,
    title={Exploring Connections Between Active Learning and Model Extraction},
    author={Varun Chandrasekaran and Kamalika Chaudhuri and Irene Giacomelli and Somesh Jha and Songbai Yan},
    booktitle={USENIX Security Symposium}
  }
Machine learning is being increasingly used by individuals, research institutions, and corporations. This has resulted in the surge of Machine Learning-as-a-Service (MLaaS): cloud services that provide (a) tools and resources to learn a model, and (b) a user-friendly query interface to access it. However, such MLaaS systems raise privacy concerns such as model extraction. In model extraction attacks, adversaries maliciously exploit the query interface to steal the model.
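The basic threat in the abstract above can be illustrated with a toy sketch. Everything here (the victim's weights, the query budget, the substitute's training loop) is a hypothetical illustration of the general attack pattern, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Victim: a hidden linear classifier behind a label-only query interface.
# The weights are unknown to the adversary; only `query` is exposed.
W_TRUE, B_TRUE = np.array([2.0, -1.0]), 0.5

def query(x):
    """MLaaS-style interface: returns hard labels only."""
    return (x @ W_TRUE + B_TRUE > 0).astype(float)

# Adversary: label random inputs via the interface, then fit a substitute
# with plain logistic-regression gradient descent.
X = rng.normal(size=(2000, 2))
y = query(X)

w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(X)
    b -= 0.1 * grad.mean()

# Fidelity: how often the substitute agrees with the victim on fresh inputs.
X_test = rng.normal(size=(1000, 2))
fidelity = ((X_test @ w + b > 0) == (query(X_test) > 0.5)).mean()
```

Even with hard labels only, the substitute's decision boundary converges to the victim's, which is why label-only query interfaces still leak the model.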

ActiveThief: Model Extraction Using Active Learning and Unannotated Public Data

This work designs ActiveThief, a model extraction framework for deep neural networks that uses active learning techniques and unannotated public datasets, and demonstrates that it can be used to extract deep classifiers trained on a variety of datasets from the image and text domains.

A framework for the extraction of Deep Neural Networks by leveraging public data

This work designs a model extraction framework that makes use of active learning and large public datasets to satisfy the three essential criteria for practical model extraction, and demonstrates that it is possible to use this framework to steal deep classifiers trained on a variety of datasets from image and text domains.
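The core loop shared by these active-learning extraction frameworks (query a seed set, fit a substitute, then spend the remaining budget on the pool points the substitute is least certain about) can be sketched as follows. The victim, the pool, and the least-squares substitute are hypothetical stand-ins, not the papers' architectures:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical victim: a hidden linear 3-class model behind a query interface.
W_V = rng.normal(size=(5, 3))
def victim_label(x):
    return (x @ W_V).argmax(axis=1)

pool = rng.normal(size=(5000, 5))                 # unannotated public data
labeled_idx = list(rng.choice(len(pool), 50, replace=False))  # seed queries

def fit_substitute(X, y):
    """One-vs-all least-squares stand-in for the substitute model."""
    W, *_ = np.linalg.lstsq(X, np.eye(3)[y], rcond=None)
    return W

for _ in range(5):                                # active-learning rounds
    W_S = fit_substitute(pool[labeled_idx], victim_label(pool[labeled_idx]))
    scores = pool @ W_S
    p = np.exp(scores - scores.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    entropy[labeled_idx] = -np.inf                # never re-query a point
    labeled_idx += list(np.argsort(entropy)[-50:])  # most-uncertain batch

W_S = fit_substitute(pool[labeled_idx], victim_label(pool[labeled_idx]))
fidelity = ((pool @ W_S).argmax(axis=1) == victim_label(pool)).mean()
```

The design choice that matters is where the uncertainty comes from: it is computed on the adversary's own substitute, so no extra victim queries are spent on sample selection.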

A Framework for Understanding Model Extraction Attack and Defense

New metrics are developed to quantify the fundamental tradeoff between model utility from a benign user's view and privacy from an adversary's view, and an optimization problem is formulated to characterise the optimal adversarial attack and defense strategies.

I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences

This survey categorises and compares model stealing attacks, assesses their performance, and explores corresponding defence techniques in different settings; it proposes a taxonomy for attack and defence approaches, and provides guidelines on how to select the right attack or defence strategy based on the goal and available resources.

Machine learning privacy : analysis and implementation of model extraction attacks

This work presents MET, which implements state-of-the-art model extraction attacks on arbitrary ML models and datasets, and proposes and implements improvements to some of the attacks in both speed and performance.

Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realisation

This paper comprehensively investigates and develops model extraction attacks against GNN models, systematically formalises threat modelling in the context of GNN model extraction, and classifies the adversarial threats into seven categories according to the attacker's background knowledge.

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

This work shows how an adversary can steal a BERT-based API service (the victim/target model) on multiple benchmark datasets with limited prior knowledge and queries, and shows that the extracted model can mount highly transferable adversarial attacks against the victim model.

Data-Free Model Extraction

It is found that the proposed data-free model extraction approach achieves high accuracy with reasonable query complexity, reaching 0.99× and 0.92× the victim model's accuracy on the SVHN and CIFAR-10 datasets given 2M and 20M queries respectively.

Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks

This paper defines fidelity and accuracy on model extraction attacks against generative adversarial networks (GANs) and proposes effective defense techniques to safeguard GANs, considering a trade-off between the utility and security of GAN models.

Adversarial Model Extraction on Graph Neural Networks

This work formalizes an instance of GNN extraction, presents a solution with preliminary results, and discusses the assumptions and future directions.

Stealing Machine Learning Models via Prediction APIs

Simple, efficient attacks are shown that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees against the online services of BigML and Amazon Machine Learning.
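One of the attacks described above exploits the fact that prediction APIs often return confidence scores, not just labels. For logistic regression, the logit of the returned probability is linear in the input, so d+1 generic queries suffice to recover the exact parameters by solving a linear system. A minimal sketch, with a hypothetical victim in place of a real API:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical victim: logistic regression exposing confidence scores.
w_true, b_true = rng.normal(size=4), 0.3
def api(x):
    return 1 / (1 + np.exp(-(x @ w_true + b_true)))

# Equation-solving: logit(p) = w·x + b is linear in (x, 1),
# so d + 1 = 5 generic queries pin down all parameters exactly.
X = rng.normal(size=(5, 4))
p = api(X)
A = np.hstack([X, np.ones((5, 1))])
theta, *_ = np.linalg.lstsq(A, np.log(p / (1 - p)), rcond=None)
w_hat, b_hat = theta[:4], theta[4]
```

This is why "near-perfect fidelity" is achievable for such model classes: the recovery is algebraic, not statistical.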

Towards the Science of Security and Privacy in Machine Learning

This work shows that there are (possibly unavoidable) tensions between model complexity, accuracy, and resilience that must be calibrated for the environments in which the models will be used, and formally explores the opposing relationship between model accuracy and resilience to adversarial manipulation.

Membership Inference Attacks Against Machine Learning Models

This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.
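The leakage described above is easiest to see on an overfit model, where training-set members draw systematically higher confidence than non-members. A toy confidence-threshold sketch (the 1-nearest-neighbour "victim" and the threshold are hypothetical illustrations, not the paper's shadow-model technique):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical overfit victim: 1-nearest-neighbour memorises its training set.
train = rng.normal(size=(200, 8))
def confidence(x):
    """Victim 'confidence': decays with distance to the nearest training point."""
    return np.exp(-np.linalg.norm(train - x, axis=1).min())

# Threshold attack: members sit at distance 0 and draw maximal confidence.
members = train[:100]
non_members = rng.normal(size=(100, 8))
scores_in = np.array([confidence(x) for x in members])
scores_out = np.array([confidence(x) for x in non_members])
threshold = 0.9
attack_acc = ((scores_in > threshold).mean() +
              (scores_out <= threshold).mean()) / 2
```

The gap between member and non-member confidence is exactly the overfitting signal that membership inference attacks measure.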

How to steal a machine learning classifier with deep learning

This paper presents an exploratory machine learning attack based on deep learning to infer the functionality of an arbitrary classifier by polling it as a black box and using the returned labels to train a classifier that replicates its functionality.

Practical Black-Box Attacks against Machine Learning

This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge, and finds that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
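The transfer-based strategy this work demonstrates (train a local substitute on victim-labelled queries, craft adversarial examples against the substitute, then replay them against the victim) can be sketched on a linear toy problem. The victim, query budget, and perturbation size below are hypothetical, and the single-step perturbation is a simplified FGSM-style step:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical black-box victim (weights unknown to the adversary).
W_V, B_V = np.array([1.5, -2.0]), 0.0
victim = lambda x: (x @ W_V + B_V > 0).astype(int)

# 1. Train a substitute on victim-labelled queries (logistic-regression GD).
X = rng.normal(size=(1000, 2))
y = victim(X)
w, b = np.zeros(2), 0.0
for _ in range(1000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(X)
    b -= 0.1 * (p - y).mean()

# 2. FGSM-style step on the *substitute*: for a linear model, the gradient
#    of the logit w.r.t. the input is simply w.
x0 = X[victim(X) == 1][:50]            # points the victim labels as class 1
eps = 2.0
x_adv = x0 - eps * np.sign(w)

# 3. Transferability: how often the perturbation also flips the victim.
transfer_rate = (victim(x_adv) == 0).mean()
```

The perturbations are computed with zero knowledge of W_V, yet flip the victim, which is the black-box transferability phenomenon the abstract refers to.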

Adversarial Machine Learning at Scale

This research applies adversarial training to ImageNet, finds that single-step attack methods are best for mounting black-box attacks, and resolves a "label leaking" effect that causes adversarially trained models to perform better on adversarial examples than on clean examples.

Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers

It is shown that it is possible to infer unexpected but useful information from ML classifiers and that this kind of information leakage can be exploited by a vendor to build more effective classifiers or to simply acquire trade secrets from a competitor's apparatus, potentially violating its intellectual property rights.

Practical Evasion of a Learning-Based Classifier: A Case Study

This case study develops a taxonomy for practical evasion strategies and adapts known evasion algorithms to implement specific scenarios in that taxonomy, revealing a substantial drop in PDFrate's classification scores and detection accuracy after exposure to even simple attacks.

Adversarial machine learning

This work gives a taxonomy for classifying attacks against online machine learning algorithms and delineates the limits of an adversary's knowledge about the algorithm, feature space, training data, and input data.

Stealing Hyperparameters in Machine Learning

This work proposes attacks that steal the hyperparameters a learner uses, applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machines, and neural networks.
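For ridge regression, the recovery reduces to simple algebra: the learned weights satisfy the first-order optimality condition X^T(Xw - y) + λw = 0, so an attacker who knows the training data and the learned weights can solve for λ directly. A minimal sketch under those (strong, but standard for this threat model) knowledge assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical setting: the attacker knows the training data (X, y) and the
# learned weights w, and recovers the regularisation strength lambda.
X = rng.normal(size=(100, 6))
y = rng.normal(size=100)
lam_true = 0.7
w = np.linalg.solve(X.T @ X + lam_true * np.eye(6), X.T @ y)  # victim training

# Optimality condition:  X^T (X w - y) + lam * w = 0.
# Rearranged, lam * w = X^T (y - X w); solve by least squares in lam.
g = X.T @ (y - X @ w)
lam_hat = (w @ g) / (w @ w)
```

Because the condition holds exactly at the trained weights, the recovered hyperparameter matches the true one up to floating-point error.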