• Corpus ID: 54444482

Privacy-Preserving Distributed Deep Learning for Clinical Data

  • Brett K. Beaulieu-Jones, William Yuan, Samuel G. Finlayson, Zhiwei Steven Wu
Deep learning with medical data often requires larger sample sizes than are available at single providers. While data sharing among institutions is desirable to train more accurate and sophisticated models, it can lead to severe privacy concerns due to the sensitive nature of the data. This problem has motivated a number of studies on distributed training of neural networks that do not require direct sharing of the training data. However, simple distributed training does not offer provable… 

A Federated Learning Framework for Privacy-preserving and Parallel Training

The developed FEDF framework allows a model to be learned on multiple geographically distributed training datasets without revealing any information about each dataset or the intermediate results; the convergence of the model trained with the framework, and its privacy-preserving property, are formally proved.

Privacy-Preserving Deep Learning Models for Law Big Data Feature Learning

  • Xu Yuan, Jianing Zhang, Zhikui Chen, Jing Gao, Peng Li
  • Computer Science
    2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech)
  • 2019
The emerging topics of deep learning for feature learning on private data are reviewed, and the open problems and future trends in deep learning for privacy-preserving feature learning on law data are discussed.

Federated and Differentially Private Learning for Electronic Health Records

It is found that while it is straightforward to apply differentially private stochastic gradient descent to achieve strong privacy bounds when training in a centralized setting, it is considerably more difficult to do so in the federated setting.

Challenges of Differentially Private Prediction in Healthcare Settings

This work demonstrates that, due to the long-tailed nature of healthcare data, learning with differential privacy results in poor utility tradeoffs. It highlights an important implication of differentially private learning: by design it focuses on learning the body of a distribution to protect privacy, but thereby omits important information contained in the tails of healthcare data distributions.

Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings

This paper uses state-of-the-art methods for DP learning to train privacy-preserving models on clinical prediction tasks, including x-ray image classification and mortality prediction from time-series data, and uses these models to perform a comprehensive empirical investigation of the tradeoffs between privacy, utility, robustness to dataset shift, and fairness.

Privacy in Deep Learning: A Survey

This survey reviews the privacy concerns brought by deep learning, and the mitigating techniques introduced to tackle these issues, and shows that there is a gap in the literature regarding test-time inference privacy.

Recent Developments in Privacy-preserving Mining of Clinical Data

Looking at dominant techniques and recent innovations to them, the applicability of these methods to the privacy-preserving analysis of clinical data is examined and promising directions for future research in this area are discussed.

Privacy-Aware Distributed Graph-Based Semi-Supervised Learning

This paper proposes a privacy-aware framework for distributed semi-supervised learning, where the training data is distributed among multiple data-owners, who wish to protect the privacy of their individual datasets from the other parties during training.

Privacy-preserving deep learning

  • R. Shokri, Vitaly Shmatikov
  • Computer Science
    2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)
  • 2015
This paper presents a practical system that enables multiple parties to jointly learn an accurate neural-network model for a given objective without sharing their input datasets, and exploits the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously.
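The selective parameter-sharing idea can be sketched as follows. The top-k magnitude selection rule and the sharing fraction below are illustrative simplifications of the paper's protocol, in which each party asynchronously uploads a chosen subset of its gradients to a parameter server:

```python
import numpy as np

def select_top_gradients(grads, fraction=0.1):
    """Keep only the largest-magnitude fraction of gradient coordinates;
    zero out the rest. Each party would upload only this sparse update,
    never its raw training data."""
    flat = grads.ravel()
    k = max(1, int(fraction * flat.size))
    # Indices of the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    shared = np.zeros_like(flat)
    shared[idx] = flat[idx]
    return shared.reshape(grads.shape)

rng = np.random.default_rng(0)
local_grads = rng.normal(size=(4, 5))       # one party's local gradients
upload = select_top_gradients(local_grads, fraction=0.2)
```

Sharing only a fraction of coordinates reduces (but, as later attack papers show, does not eliminate) leakage about the local dataset.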

Deep Learning with Differential Privacy

This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
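The core DP-SGD step this abstract describes, per-example gradient clipping followed by calibrated Gaussian noise, can be sketched as follows; the hyperparameter values are illustrative, not the paper's:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One noisy gradient estimate in the style of DP-SGD:
    clip each example's gradient to L2 norm `clip_norm`, average,
    then add Gaussian noise with scale proportional to
    clip_norm * noise_multiplier."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(scale=noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return mean_grad + noise
```

Clipping bounds each example's influence on the update (its sensitivity), which is what lets the added noise translate into a formal privacy guarantee via the moments accountant.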

Distributed deep learning networks among institutions for medical imaging

It is shown that distributing deep learning models is an effective alternative to sharing patient data, and this finding has implications for any collaborative deep learning study.

Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures

A new class of model inversion attack is developed that exploits the confidence values revealed along with predictions; it can estimate whether a respondent in a lifestyle survey admitted to cheating on their significant other, and can recover recognizable images of people's faces given only their names.

Opportunities and obstacles for deep learning in biology and medicine

This work examines applications of deep learning to a variety of biomedical problems -- patient classification, fundamental biological processes, and treatment of patients -- to predict whether deep learning will transform these tasks or if the biomedical sphere poses unique challenges.

Membership Inference Attacks Against Machine Learning Models

This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.
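A toy baseline conveys the intuition behind such leakage. Shokri et al. train shadow models and an attack classifier; the simple confidence-threshold rule below is a hypothetical stand-in, exploiting the tendency of overfit models to be more confident on their training records:

```python
import numpy as np

def confidence_threshold_attack(model_confidences, threshold=0.9):
    """Predict 'member of the training set' when the model's confidence
    on a record exceeds a threshold. A simplified stand-in for the
    shadow-model attack, not the paper's method."""
    return np.asarray(model_confidences) > threshold

# Records on which the model is unusually confident are flagged as members.
guesses = confidence_threshold_attack([0.99, 0.55, 0.95], threshold=0.9)
```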

Calibrating Noise to Sensitivity in Private Data Analysis

The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
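The Laplace mechanism is the canonical instantiation of this calibration: noise with scale sensitivity/ε is added to the query answer. A minimal sketch:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Add Laplace(sensitivity / epsilon) noise to a query answer, where
    sensitivity is the most any single record can change that answer."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=scale)

# Counting query: one person changes a count by at most 1, so sensitivity = 1.
ages = [34, 51, 29, 60, 45]
noisy_count = laplace_mechanism(len(ages), sensitivity=1.0, epsilon=0.5)
```

Lower ε (stronger privacy) or higher sensitivity both increase the noise scale, making the released answer less precise.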

Robust De-anonymization of Large Sparse Datasets

This work applies the de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service, and demonstrates that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset.

Rényi Differential Privacy

  • Ilya Mironov
  • Computer Science
    2017 IEEE 30th Computer Security Foundations Symposium (CSF)
  • 2017
This work argues that Rényi divergence, a useful analytical tool, can be used as a privacy definition that compactly and accurately represents guarantees on the tails of the privacy loss, and demonstrates that the new definition shares many important properties with the standard definition of differential privacy.
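Two standard results from the RDP literature illustrate how the definition is used in practice: the Gaussian mechanism satisfies (α, α·Δ²/2σ²)-RDP in closed form, and Mironov gives a conversion from RDP back to ordinary (ε, δ)-differential privacy:

```python
import math

def rdp_gaussian(alpha, sensitivity, sigma):
    """Closed-form Renyi divergence of order alpha between N(0, sigma^2)
    and N(sensitivity, sigma^2): the RDP guarantee of one release of the
    Gaussian mechanism."""
    return alpha * sensitivity**2 / (2 * sigma**2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Mironov's conversion: an (alpha, rdp_eps)-RDP mechanism satisfies
    (rdp_eps + log(1/delta)/(alpha - 1), delta)-differential privacy."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1)

# One Gaussian release with sensitivity 1 and sigma 1, converted at alpha = 2.
eps = rdp_to_dp(rdp_gaussian(2.0, 1.0, 1.0), alpha=2.0, delta=1e-5)
```

Because RDP guarantees add up linearly across compositions, accounting is done in RDP (often minimizing over several α) and converted to (ε, δ) only at the end.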

A Systematic Review of Re-Identification Attacks on Health Data

The current evidence shows a high re-identification rate, but is dominated by small-scale studies on data that was not de-identified according to existing standards; the evidence is insufficient to draw conclusions about the efficacy of de-identification methods.