• Corpus ID: 222290496

# Open-sourced Dataset Protection via Backdoor Watermarking

@article{Li2020OpensourcedDP,
  title={Open-sourced Dataset Protection via Backdoor Watermarking},
  author={Yiming Li and Zi-Mou Zhang and Jiawang Bai and Baoyuan Wu and Yong Jiang and Shutao Xia},
  journal={ArXiv},
  year={2020},
  volume={abs/2010.05821}
}
• Published 12 October 2020
• Computer Science
• ArXiv
The rapid development of deep learning has benefited from the release of high-quality open-sourced datasets (e.g., ImageNet), which allow researchers to easily verify the effectiveness of their algorithms. Almost all existing open-sourced datasets stipulate that they may be used only for academic or educational purposes rather than commercial ones, yet there is still no good way to enforce this. In this paper, we propose a *backdoor-embedding-based dataset watermarking*…
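The abstract's idea can be illustrated with a minimal sketch of a poison-label backdoor watermark: stamp a small trigger onto a fraction of the released samples, relabel them with a target class, and later verify ownership by checking whether a suspect model predicts that class for trigger-stamped inputs. All function names, the 3×3 corner trigger, and the 0.5 decision threshold below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def watermark_dataset(images, labels, target_label, rate=0.1, seed=0):
    """Stamp a small white-square trigger onto a random fraction of samples
    and relabel them with the target class (poison-label watermarking).
    Illustrative sketch only; the trigger pattern is an assumption."""
    rng = np.random.default_rng(seed)
    imgs, labs = images.copy(), labels.copy()
    idx = rng.choice(len(imgs), size=int(rate * len(imgs)), replace=False)
    imgs[idx, -3:, -3:] = 1.0          # 3x3 bottom-right white patch
    labs[idx] = target_label           # poison labels for selected samples
    return imgs, labs, idx

def verify_ownership(predict, test_images, target_label, threshold=0.5):
    """Ownership check: if stamping the trigger drives the suspect model's
    target-label prediction rate above `threshold`, its training set likely
    contained the watermark. `predict` maps an image batch to class labels."""
    stamped = test_images.copy()
    stamped[:, -3:, -3:] = 1.0         # apply the same trigger at test time
    success_rate = np.mean(predict(stamped) == target_label)
    return success_rate > threshold, success_rate
```

In practice the verification step would use a hypothesis test over many stamped samples rather than a single threshold, so that ownership claims come with a controlled false-positive rate.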

## Figures and Tables from this paper

### On the Effectiveness of Dataset Watermarking

• Computer Science
IWSPA@CODASPY
• 2022
It is shown that radioactive data can effectively survive model extraction attacks, which raises the possibility that it can be used for ML model ownership verification robust against model extraction.

### Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection

• Yiming Li, Yang Bai, Yong Jiang, Yong Yang, Bo Li
• Computer Science
• 2022
Deep neural networks (DNNs) have demonstrated their superiority in practice. Arguably, the rapid development of DNNs has largely benefited from high-quality (open-sourced) datasets, based on which…

### Data Isotopes for Data Provenance in DNNs

• Computer Science
ArXiv
• 2022
This work designs, implements and evaluates a practical system that enables users to detect if their data was used to train an DNN model, and shows how users can create special data points, which introduce “spurious features” into DNNs during training.

### MOVE: Effective and Harmless Ownership Verification via Embedded External Features

• Computer Science
ArXiv
• 2022
This paper proposes an effective and harmless model ownership verification method to defend against different types of model stealing simultaneously, without introducing new security risks, and develops the MOVE method under both white-box and black-box settings to provide comprehensive model protection.

### A Survey of Neural Trojan Attacks and Defenses in Deep Learning

• Computer Science
ArXiv
• 2022
A comprehensive review of the techniques that devise Trojan attacks for deep learning and explore their defenses, and provides a comprehensible gateway to the broader community to understand the recent developments in Neural Trojans.

### CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning

• Computer Science
WWW
• 2022
It is argued that there is a need to invent effective mechanisms for protecting open-source code from being exploited by deep learning models, and a prototype, CoProtector, is designed and implemented, which utilizes data poisoning techniques to arm source code repositories for defending against such exploits.

### Defending against Model Stealing via Verifying Embedded External Features

• Computer Science
AAAI
• 2022
Experimental results demonstrate that the method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.

### Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks

• Computer Science
• 2021
To the best knowledge, this work is the first to protect an individual user’s data ownership from unauthorized use in training neural networks.

### Backdoor Learning: A Survey

• Computer Science
ArXiv
• 2020
This article summarizes and categorizes existing backdoor attacks and defenses based on their characteristics, and provides a unified framework for analyzing poisoning-based backdoor attacks, and summarizes widely adopted benchmark datasets.

## References

Showing 1–10 of 32 references

### Invisible Backdoor Attacks on Deep Neural Networks Via Steganography and Regularization

• Computer Science
IEEE Transactions on Dependable and Secure Computing
• 2021
It is argued that the proposed invisible backdoor attacks can effectively thwart the state-of-the-art trojan backdoor detection approaches.

### A New Robust Approach for Reversible Database Watermarking with Distortion Control

• Computer Science
IEEE Transactions on Knowledge and Data Engineering
• 2019
Experimental results demonstrate the effectiveness of GAHSW and show that it outperforms state-of-the-art approaches in terms of robustness against malicious attacks and preservation of data quality.

### Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

• Computer Science
ArXiv
• 2017
This work considers a new type of attack, called a backdoor attack, in which the attacker's goal is to create a backdoor in a learning-based authentication system so that the system can easily be circumvented by leveraging the backdoor.

### Backdoor Learning: A Survey

• Computer Science
ArXiv
• 2020
This article summarizes and categorizes existing backdoor attacks and defenses based on their characteristics, and provides a unified framework for analyzing poisoning-based backdoor attacks, and summarizes widely adopted benchmark datasets.

### Rethinking the Trigger of Backdoor Attack

• Computer Science
ArXiv
• 2020
This paper demonstrates that many backdoor attack paradigms are vulnerable when the trigger in testing images is not consistent with the one used for training, and proposes a transformation-based attack enhancement to improve the robustness of existing attacks against transformation-based defenses.

### BadNets: Evaluating Backdooring Attacks on Deep Neural Networks

• Computer Science
IEEE Access
• 2019
It is shown that the outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has the state-of-the-art performance on the user's training and validation samples but behaves badly on specific attacker-chosen inputs.

### A Robust and Reversible Watermarking Technique for Relational Dataset Based on Clustering

• Computer Science
2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE)
• 2019
A cluster-based robust and reversible watermarking (RRWC) technique for relational data is proposed that provides a solution to two major functions: ownership rights protection and partial data traceability.

### How To Backdoor Federated Learning

• Computer Science
AISTATS
• 2020
This work designs and evaluates a new model-poisoning methodology based on model replacement and demonstrates that any participant in federated learning can introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features.

### A Survey on Neural Trojans

• Computer Science
2020 21st International Symposium on Quality Electronic Design (ISQED)
• 2020
This paper surveys a myriad of neural Trojan attack and defense techniques that have been proposed over the last few years and systematizes the above attack and defense approaches.