How reparametrization trick broke differentially-private text representation learning

@article{Habernal2022HowRT,
  title={How reparametrization trick broke differentially-private text representation learning},
  author={Ivan Habernal},
  journal={ArXiv},
  year={2022},
  volume={abs/2202.12138}
}
As privacy gains traction in the NLP community, researchers have started adopting various privacy-preserving methods. Among the available privacy frameworks, differential privacy (DP) is perhaps the most compelling, thanks to its fundamental theoretical guarantees. Yet despite the apparent simplicity of the general concept, it seems non-trivial to get differential privacy right when applying it to NLP. In this short paper, we formally analyze several recent NLP papers proposing…
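The failure mode the title alludes to can be illustrated with the Laplace mechanism implemented via inverse-CDF reparametrization. The sketch below is ours, not code from the paper, and the names (laplace_reparam, broken_reparam, mu, b, u) are illustrative. The point is general: a mechanism is only differentially private if its noise variable is freshly sampled on every invocation; if u instead comes deterministically out of an encoder, the "mechanism" outputs the same vector every time and no ε-DP guarantee can hold.

import numpy as np

def laplace_reparam(mu, b, rng):
    # Correct reparametrization: draw a fresh u ~ Uniform(-1/2, 1/2) and
    # push it through the inverse CDF of the Laplace(mu, b) distribution.
    u = rng.uniform(-0.5, 0.5, size=np.shape(mu))
    return mu - b * np.sign(u) * np.log1p(-2.0 * np.abs(u))

def broken_reparam(mu, b, u):
    # Degenerate variant: the same formula, but u is a deterministic
    # model output rather than a random draw, so no noise is added.
    return mu - b * np.sign(u) * np.log1p(-2.0 * np.abs(u))

rng = np.random.default_rng(0)
mu = np.zeros(3)
print(laplace_reparam(mu, 1.0, rng))                # differs on every call
print(broken_reparam(mu, 1.0, u=np.full(3, 0.25)))  # identical on every call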

References

Showing 1-10 of 25 references
When differential privacy meets NLP: The devil is in the detail
TLDR
It is argued that if differential privacy applications in NLP rely on formal guarantees, these should be outlined in full and put under detailed scrutiny; such scrutiny reveals that ADePT is not differentially private, rendering its experimental results unsubstantiated.
ADePT: Auto-encoder based Differentially Private Text Transformation
TLDR
This paper transforms text with an auto-encoder to offer robustness against attacks, produces transformations with high semantic quality that perform well on downstream NLP tasks, proves the algorithm's theoretical privacy guarantee, and assesses its privacy leakage under Membership Inference Attacks (MIA) on models trained with transformed data.
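For context, the ADePT recipe summarized above can be sketched in a few lines. This is our illustrative pseudocode under stated assumptions, not the authors' implementation: encode and decode stand in for the auto-encoder halves, C is a latent clipping bound, and the Laplace scale 2C/ε follows the paper's claimed sensitivity, an analysis the entry above shows to be flawed.

import numpy as np

def adept_transform(x, encode, decode, C, epsilon, rng):
    r = encode(x)                    # text -> latent vector
    norm = np.linalg.norm(r)
    if norm > C:                     # clip the latent's L2 norm to C
        r = r * (C / norm)
    b = 2.0 * C / epsilon            # scale from the claimed sensitivity 2C
    r = r + rng.laplace(0.0, b, size=r.shape)  # perturb with Laplace noise
    return decode(r)                 # latent -> transformed text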
Privacy-Preserving Graph Convolutional Networks for Text Classification
TLDR
This work proposes a simple yet effective method based on random graph splits that not only improves the baseline privacy bounds by a factor of 2.7 while retaining competitive F1 scores, but also provides strong privacy guarantees of ε = 1.
One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks
TLDR
It is shown that, unlike standard non-private approaches to solving NLP tasks, where bigger is usually better, privacy-preserving strategies do not exhibit a winning pattern, and each task and privacy regime requires special treatment to achieve adequate performance.
I Am Not What I Write: Privacy Preserving Text Representation Learning
TLDR
A novel double privacy preserving text representation learning framework, DPText, is proposed, which learns a textual representation that is differentially private, does not contain private information and retains high utility for the given task.
The Algorithmic Foundations of Differential Privacy
TLDR
The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy and to the application of these techniques in creative combinations, using the query-release problem as an ongoing example.
Privacy Preserving Text Representation Learning
TLDR
This paper proposes a novel double privacy preserving text representation learning framework, DPText, and shows the effectiveness of DPText in preserving privacy and utility.
Towards Robust and Privacy-preserving Text Representations
TLDR
This paper proposes an approach to explicitly obscure important author characteristics at training time, such that the learned representations are invariant to these attributes, which leads to increased privacy.
Rényi Differential Privacy
Ilya Mironov · 2017 IEEE 30th Computer Security Foundations Symposium (CSF) · 2017
TLDR
This work argues that Rényi divergence, a well-known analytical tool, can be used as a privacy definition, compactly and accurately representing guarantees on the tails of the privacy loss, and demonstrates that the new definition shares many important properties with the standard definition of differential privacy.
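For reference, the standard statement of Mironov's definition (our transcription, not quoted from this listing): a randomized mechanism M is (α, ε)-RDP if for all adjacent datasets D and D',

\[
  D_\alpha\bigl(\mathcal{M}(D)\,\|\,\mathcal{M}(D')\bigr)
  = \frac{1}{\alpha-1}\,
    \log\, \mathbb{E}_{x \sim \mathcal{M}(D')}
    \left[\left(\frac{\Pr[\mathcal{M}(D)=x]}{\Pr[\mathcal{M}(D')=x]}\right)^{\alpha}\right]
  \le \varepsilon,
\]

which recovers the standard ε-DP guarantee in the limit α → ∞.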
Anonymisation Models for Text Data: State of the art, Challenges and Future Directions
TLDR
A case is laid out for moving beyond sequence labelling models and incorporating explicit measures of disclosure risk into the text anonymisation process, along with a discussion of how to evaluate the quality of the resulting anonymisation.