Robust and Verifiable Information Embedding Attacks to Deep Neural Networks via Error-Correcting Codes

Jinyuan Jia, Binghui Wang, and N. Gong. Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security.
In the era of deep learning, a user often leverages a third-party machine learning tool to train a deep neural network (DNN) classifier and then deploys the classifier as an end-user software product (e.g., a mobile app) or a cloud service. In an information embedding attack, an attacker is the provider of a malicious third-party machine learning tool. The attacker embeds a message into the DNN classifier during training and recovers the message via querying the API of the black-box classifier…
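The title's key idea is that an error-correcting code makes the recovered message robust to bit errors, e.g., when the deployed classifier is fine-tuned or pruned after the message is embedded. A minimal sketch of that principle, using a simple 5x repetition code with majority-vote decoding (the repetition factor and all names here are illustrative assumptions, not the paper's actual construction):

```python
# Sketch: a repetition code tolerates a bounded number of flipped bits,
# so the embedded message survives noisy recovery from the black-box API.

def ecc_encode(bits, r=5):
    """Encode each message bit by repeating it r times."""
    return [b for b in bits for _ in range(r)]

def ecc_decode(codeword, r=5):
    """Recover each message bit by majority vote over its r copies."""
    return [1 if sum(codeword[i:i + r]) > r // 2 else 0
            for i in range(0, len(codeword), r)]

def flip(codeword, positions):
    """Simulate corruption: flip the bits at the given positions."""
    noisy = list(codeword)
    for p in positions:
        noisy[p] ^= 1
    return noisy

message = [1, 0, 1, 1]
codeword = ecc_encode(message)       # 20 bits carried by attacker queries
noisy = flip(codeword, [0, 7, 13])   # 3 bits corrupted during recovery
assert ecc_decode(noisy) == message  # majority vote recovers the message
```

Each codeword group can absorb up to two flipped bits (out of five) before its majority vote fails; stronger codes trade redundancy for higher error tolerance.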



Practical Black-Box Attacks against Machine Learning
This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN without knowledge of its internals or training data, and finds that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
It is shown that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples, but behaves badly on specific attacker-chosen inputs.
Protecting Intellectual Property of Deep Neural Networks with Watermarking
This paper generalizes the "digital watermarking" concept from multimedia ownership verification to deep neural network (DNN) models: by exploiting the intrinsic generalization and memorization capabilities of DNNs, models learn specially crafted watermarks during training and produce pre-specified predictions when observing the watermark patterns at inference.
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
This work considers a new type of attack, called a backdoor attack, in which the attacker's goal is to create a backdoor in a learning-based authentication system so that the system can easily be circumvented by leveraging the backdoor.
Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks
Fine-pruning, a combination of pruning and fine-tuning, is evaluated and shown to successfully weaken or even eliminate backdoors, in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs.
Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring
This work presents an approach for watermarking Deep Neural Networks in a black-box way, and shows experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for.
Trojaning Attack on Neural Networks
A trojaning attack on neural networks that can be successfully triggered without affecting the model's test accuracy on normal input data, and that takes only a small amount of time to mount against a complex neural network model.
Stealing Machine Learning Models via Prediction APIs
Simple, efficient attacks are shown that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees against the online services of BigML and Amazon Machine Learning.
Machine Learning with Membership Privacy using Adversarial Regularization
It is shown that the min-max strategy can mitigate the risks of membership inference attacks (reducing them to near-random guessing), and can achieve this with a negligible drop in the model's prediction accuracy (less than 4%).
MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples
This work proposes MemGuard, the first defense with formal utility-loss guarantees against black-box membership inference attacks, and is the first to show that adversarial examples can be used as a defensive mechanism against membership inference attacks.