Cracking White-box DNN Watermarks via Invariant Neuron Transforms

Yifan Yan, Xudong Pan, Yining Wang, Mi Zhang, Min Yang
Recently, protecting the intellectual property (IP) of deep neural networks (DNNs) has become a major concern for the AI industry. To combat potential model piracy, recent works explore various watermarking strategies that embed secret identity messages into either the prediction behaviors or the internals (e.g., weights and neuron activations) of the target model. Sacrificing less functionality and involving more knowledge about the target model, the latter branch of watermarking schemes (i.e., white…
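As a rough illustration of what an "invariant neuron transform" in the title might look like, the sketch below applies two textbook function-preserving transforms to a tiny ReLU network: permuting hidden neurons and rescaling them by positive factors. This is an assumption chosen for illustration, not necessarily the paper's method, since the abstract above is truncated. All names (`permute_hidden`, `scale_hidden`, the layer sizes) are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(W1, b1, W2, x):
    # One hidden ReLU layer followed by a linear output layer.
    h = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return matvec(W2, h)

def permute_hidden(W1, b1, W2, perm):
    # Reorder hidden neurons: rows of W1 and b1, and the matching columns of W2.
    W1p = [W1[j] for j in perm]
    b1p = [b1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, b1p, W2p

def scale_hidden(W1, b1, W2, scales):
    # Multiply neuron j's incoming weights and bias by s_j > 0 and divide its
    # outgoing weights by s_j; ReLU satisfies relu(s * z) == s * relu(z) for s > 0.
    W1s = [[s * w for w in row] for row, s in zip(W1, scales)]
    b1s = [s * b for b, s in zip(b1, scales)]
    W2s = [[w / s for w, s in zip(row, scales)] for row in W2]
    return W1s, b1s, W2s

rng = random.Random(0)
n_in, n_hid, n_out = 4, 6, 3  # hypothetical layer sizes
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hid)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

perm = list(range(n_hid))
rng.shuffle(perm)
scales = [rng.uniform(0.5, 2.0) for _ in range(n_hid)]

W1t, b1t, W2t = scale_hidden(*permute_hidden(W1, b1, W2, perm), scales)

y_ref = forward(W1, b1, W2, x)
y_new = forward(W1t, b1t, W2t, x)
max_diff = max(abs(a - b) for a, b in zip(y_ref, y_new))
weights_changed = W1t != W1
```

The transformed network computes the same outputs up to floating-point error while its weights differ from the original, which is why a watermark read directly from the raw weights could in principle be disturbed without any loss of model utility.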

CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks

The paper proves that even the savviest attacker cannot feasibly reveal the watermarks used from a large pool of potential word pairs by statistical inspection, and proposes an optimization method for choosing watermarking rules that minimize the distortion of overall word distributions while maximizing the change in conditional word selections.

Tracking Dataset IP Use in Deep Neural Networks

A novel DNN fingerprinting technique dubbed DeepTaster is proposed to counter a new attack scenario in which a victim's data is stolen to build a suspect model. It can effectively detect such data theft attacks even when the suspect model's architecture differs from the victim model's.



RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks

This paper proposes Robust Watermarking (RIGA), a novel white-box watermarking algorithm that uses adversarial training to significantly improve covertness and robustness over the current state of the art.

Watermarking Deep Neural Networks with Greedy Residuals

This paper greedily selects a few important model parameters for embedding, so that the impairment caused by the changed parameters is reduced and robustness against different attacks is improved, since the selected parameters preserve the model information well.

Attacks on Digital Watermarks for Deep Neural Networks

  • Tianhao Wang, F. Kerschbaum
  • Computer Science
    ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2019
This paper shows that a detection algorithm can not only detect the presence of a watermark but also derive its embedding length, and can use this information to remove the watermark by overwriting it. The paper also proposes a possible countermeasure.

DeepIP: Deep Neural Network Intellectual Property Protection with Passports.

Novel passport-based DNN ownership verification schemes are proposed that are both robust to network modifications and resilient to ambiguity attacks; extensive experimental results support the effectiveness of the proposed schemes.

Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

This work presents an approach for watermarking Deep Neural Networks in a black-box way, and shows experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for.

Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models

A novel watermark removal attack is proposed that combines imperceptible pattern embedding with spatial-level transformations, and can effectively and blindly destroy watermarked models' memorization of the watermark samples.

Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models

DEEPJUDGE, a novel testing framework for deep learning copyright protection, quantitatively tests the similarity between two deep learning models, a victim model and a suspect model. It leverages a diverse set of testing metrics and efficient test case generation algorithms to produce a chain of supporting evidence that helps determine whether the suspect model is a copy of the victim model.

Effectiveness of Distillation Attack and Countermeasure on Neural Network Watermarking

This paper shows that distillation, a widely used transformation technique, is quite an effective attack for removing watermarks embedded by existing algorithms, and designs a countermeasure, ingrain, in response to the destructive distillation.

Protecting Intellectual Property of Deep Neural Networks with Watermarking

This paper generalizes the "digital watermarking" concept from multimedia ownership verification to deep neural network (DNN) models. By extending the intrinsic generalization and memorization capabilities of deep neural networks, it enables models to learn specially crafted watermarks at training and to activate with pre-specified predictions when observing the watermark patterns at inference.