Corpus ID: 235391080

Synthetic Data -- Anonymisation Groundhog Day

@inproceedings{Stadler2020SyntheticD,
  title={Synthetic Data -- Anonymisation Groundhog Day},
  author={Theresa Stadler and Bristena Oprisanu and C. Troncoso},
  year={2020}
}
Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing that addresses the shortcomings of traditional anonymisation techniques. The promise is that synthetic data drawn from generative models preserves the statistical properties of the original dataset but, at the same time, provides perfect protection against privacy attacks. In this work, we present the first quantitative evaluation of the privacy gain of synthetic data publishing and compare it…
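The paper's core measurement is the drop in an adversary's advantage when synthetic data is published in place of the raw data. As a rough illustration only (a minimal sketch, not the authors' released evaluation framework; the toy Gaussian "generative model" and the summary-statistic features are simplifying assumptions), a membership-inference experiment against synthetic outputs might look like:

```python
# Sketch: estimate the advantage of a membership-inference attacker who only
# sees synthetic data, then report the privacy gain relative to raw publishing
# (where membership of a record is trivially checkable, i.e. advantage 1).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
population = rng.normal(size=(10_000, 3))   # stand-in population
target = population[0]                      # record under attack

def synth_features(train):
    """Fit a toy Gaussian generative model and summarise its synthetic output."""
    mean, cov = train.mean(axis=0), np.cov(train, rowvar=False)
    synthetic = rng.multivariate_normal(mean, cov, size=len(train))
    return np.concatenate([synthetic.mean(axis=0), synthetic.std(axis=0)])

def trial(include_target):
    idx = rng.choice(np.arange(1, len(population)), size=99, replace=False)
    train = population[idx]
    if include_target:
        train = np.vstack([train, target])
    return synth_features(train)

labels = rng.integers(0, 2, size=400)
feats = np.array([trial(bool(l)) for l in labels])
attack = LogisticRegression(max_iter=1000).fit(feats[:200], labels[:200])
acc_syn = attack.score(feats[200:], labels[200:])

adv_syn = 2 * acc_syn - 1   # advantage over random guessing
print(f"attack advantage on synthetic: {adv_syn:.2f}, privacy gain: {1 - adv_syn:.2f}")
```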
Citations

A Review of Generative Adversarial Networks in Cancer Imaging: New Applications, New Solutions
The potential of GANs to address key challenges of cancer imaging, including data scarcity and imbalance, domain and dataset shifts, data access and privacy, data annotation and quantification, as well as cancer detection, tumour profiling, and treatment planning, is assessed.
Effective and Privacy preserving Tabular Data Synthesizing
While data sharing is crucial for knowledge development, privacy concerns and strict regulation (e.g., the European General Data Protection Regulation (GDPR)) unfortunately limit its full effectiveness.
Generative Models for Security: Attacks, Defenses, and Opportunities
The use of generative models in adversarial machine learning, in helping automate or enhance existing attacks, and as building blocks for defenses in contexts such as intrusion detection, biometrics spoofing, and malware obfuscation is discussed.

References

Showing 1-10 of 70 references
Can You Fake It Until You Make It?: Impacts of Differentially Private Synthetic Data on Downstream Classification Fairness
The results show that additional work improving the utility and fairness of DP generative models is required before they can serve as a solution to privacy and fairness issues stemming from a lack of diversity in the training dataset.
Growing synthetic data through differentially-private vine copulas
The COPULA-SHIRLEY method is based on the differentially private training of vine copulas, a family of copulas that can model and generate data of arbitrary dimension; it can be applied to many data types while preserving utility.
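To illustrate the copula-based synthesis idea behind this entry (a simplified sketch only: it uses a plain, non-private Gaussian copula rather than the paper's differentially private vine copulas), the two-step recipe is to model each marginal separately and then model the dependence structure on its own:

```python
# Sketch: Gaussian-copula synthesis as a stand-in for vine-copula synthesis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.gamma(shape=2.0, scale=1.0, size=(5_000, 2))
data[:, 1] += 0.8 * data[:, 0]                  # inject correlation

# 1. Map each column to uniforms via its empirical CDF (the marginals).
n, d = data.shape
ranks = np.argsort(np.argsort(data, axis=0), axis=0)
u = (ranks + 1) / (n + 1)

# 2. Fit the dependence structure: correlation of the Gaussian scores.
z = stats.norm.ppf(u)
corr = np.corrcoef(z, rowvar=False)

# 3. Sample fresh Gaussian scores and push them back through each column's
#    empirical quantile function to recover the original marginals.
z_new = rng.multivariate_normal(np.zeros(d), corr, size=n)
u_new = stats.norm.cdf(z_new)
synthetic = np.column_stack(
    [np.quantile(data[:, j], u_new[:, j]) for j in range(d)]
)
print(np.corrcoef(synthetic, rowvar=False))     # dependence is preserved
```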
privGAN: Protecting GANs from membership inference attacks at low cost to utility
A novel GAN architecture is proposed that can generate synthetic data in a privacy-preserving manner with minimal hyperparameter tuning and architecture selection, and a theoretical characterisation of the optimal solution of the privGAN loss function is provided.
A&E synthetic data
  • 2020
A Pragmatic Approach to Membership Inferences on Machine Learning Models
Membership inference attacks are revisited from the perspective of a pragmatic adversary who carefully selects targets and makes predictions conservatively; a new evaluation methodology is designed that measures the membership privacy risk of individual records rather than only in aggregate.
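A minimal sketch of the thresholding idea behind such a pragmatic adversary (the score distributions and the fixed false-positive rate below are illustrative assumptions, not the paper's setup): the attacker only calls "member" when the attack score clears a high threshold, so aggregate accuracy no longer hides that a few individuals are reliably exposed.

```python
# Sketch: conservative membership predictions at a fixed false-positive rate.
import numpy as np

rng = np.random.default_rng(2)
scores_members = rng.normal(0.8, 0.4, size=1_000)   # attack scores, members
scores_non = rng.normal(0.0, 0.4, size=1_000)       # attack scores, non-members

threshold = np.quantile(scores_non, 0.99)           # cap false positives at ~1%
flagged = scores_members > threshold                # confident "member" calls only

print(f"members confidently identified: {flagged.mean():.1%}")
print(f"highest per-record risk scores: {np.sort(scores_members)[-5:]}")
```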
An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?
A reconstruction attack on InstaHide is presented that uses the encoded images to recover visually recognizable versions of the originals, and barriers are established against achieving privacy through any learning protocol that uses instance encoding.
US Office of Management and Budget and Office of Science and Technology Policy
  • 2020
Really Useful Synthetic Data - A Framework to Evaluate the Quality of Differentially Private Synthetic Data
A framework is proposed to evaluate the quality of differentially private synthetic data from an applied researcher's perspective, and the academic community is invited to jointly advance the privacy-quality frontier.
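One widely used quality check in this spirit is train-on-synthetic, test-on-real (TSTR); whether it matches this framework's exact metrics is an assumption. A minimal sketch, with noisy copies of the real data standing in for the output of a differentially private generator:

```python
# Sketch: TSTR -- fit a model on synthetic data, score it on held-out real data,
# and compare against a model trained on the real data itself.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2_000, n_features=10, random_state=0)
X_real, X_test, y_real, y_test = train_test_split(X, y, random_state=0)

# Placeholder "synthetic" data: real data plus noise (not a real DP generator).
rng = np.random.default_rng(3)
X_syn = X_real + rng.normal(scale=0.5, size=X_real.shape)

real_model = LogisticRegression(max_iter=1000).fit(X_real, y_real)
syn_model = LogisticRegression(max_iter=1000).fit(X_syn, y_real)
print(f"train-on-real accuracy:            {real_model.score(X_test, y_test):.3f}")
print(f"train-on-synthetic (TSTR) accuracy: {syn_model.score(X_test, y_test):.3f}")
```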
Safe synthetic data - privacy, utility and control
  • 2020
Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference
This work shows how a model's idiosyncratic use of features can provide evidence of membership to white-box attackers, even when the model's black-box behavior appears to generalize well, and demonstrates that this attack outperforms prior black-box methods.