Corpus ID: 236088102

An Experimental Study of Data Heterogeneity in Federated Learning Methods for Medical Imaging

@article{Qu2021AnES,
  title={An Experimental Study of Data Heterogeneity in Federated Learning Methods for Medical Imaging},
  author={Liangqiong Qu and Niranjan Balachandar and D. Rubin},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.08371}
}
Federated learning enables multiple institutions to collaboratively train machine learning models on their local data in a privacy-preserving way. However, its distributed nature often leads to significant heterogeneity in data distributions across institutions. In this paper, we investigate the deleterious impact of a taxonomy of data heterogeneity regimes on federated learning methods, including quantity skew, label distribution skew, and imaging acquisition skew. We show that the performance…
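These skew regimes are straightforward to simulate when benchmarking federated methods. The sketch below partitions a labeled dataset into synthetic institutions with quantity skew (unequal shard sizes) and label distribution skew (per-class Dirichlet proportions); the function names and the Dirichlet parameterization are illustrative assumptions, not the paper's exact protocol, and imaging acquisition skew is omitted since it stems from scanner and preprocessing differences rather than how samples are partitioned.

```python
import numpy as np

def quantity_skew_partition(n_samples, n_clients, alpha=0.5, seed=0):
    """Split sample indices into unequal shards whose sizes follow
    Dirichlet(alpha) proportions; smaller alpha -> more unequal shards."""
    rng = np.random.default_rng(seed)
    proportions = rng.dirichlet(alpha * np.ones(n_clients))
    counts = (proportions * n_samples).astype(int)
    counts[-1] = n_samples - counts[:-1].sum()  # give the remainder to the last client
    idx = rng.permutation(n_samples)
    return np.split(idx, np.cumsum(counts)[:-1])

def label_skew_partition(labels, n_clients, alpha=0.5, seed=0):
    """For each class, divide its samples among clients with Dirichlet(alpha)
    proportions, so each client sees a different label distribution."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        c_idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(c_idx)).astype(int)
        for shards, shard in zip(client_idx, np.split(c_idx, cuts)):
            shards.extend(shard.tolist())
    return [np.asarray(ci) for ci in client_idx]

# Example: 10,000 binary-labeled samples split across 4 institutions.
labels = np.random.default_rng(1).integers(0, 2, size=10_000)
for i, s in enumerate(label_skew_partition(labels, n_clients=4, alpha=0.5)):
    print(f"client {i}: n={len(s)}, positive rate={labels[s].mean():.2f}")
```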


References

Showing 1–10 of 23 references
Accounting for data variability in multi-institutional distributed deep learning for medical imaging
This work is the first to identify and address the challenges of sample size and label-distribution variability in simulated distributed deep learning for medical imaging, and it improves the ability of cyclical weight transfer (CWT) to handle data variability across institutions.
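For context, cyclical weight transfer differs from averaging-based federated methods: a single model is trained at one institution, then its weights are handed to the next, cycling through all sites. Below is a minimal sketch; `train_locally` and the toy linear-regression data are hypothetical placeholders, not the reference's implementation.

```python
import numpy as np

def cyclical_weight_transfer(w, institutions, train_locally, n_cycles=5):
    """CWT: one model visits each institution in turn; no averaging step."""
    for _ in range(n_cycles):
        for data in institutions:
            w = train_locally(w, data)
    return w

# Toy local trainer: a few gradient steps on noiseless least squares
# (illustrative only).
def train_locally(w, data, lr=0.1, steps=5):
    X, y = data
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

rng = np.random.default_rng(0)
true_w = rng.normal(size=3)
institutions = []
for _ in range(3):
    X = rng.normal(size=(40, 3))
    institutions.append((X, X @ true_w))
w = cyclical_weight_transfer(np.zeros(3), institutions, train_locally)
print(np.round(w - true_w, 3))  # should be near zero
```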
Split learning for health: Distributed deep learning without sharing raw patient data
This paper compares the performance and resource-efficiency trade-offs of splitNN with other distributed deep learning methods such as federated learning and large-batch synchronous stochastic gradient descent, and shows highly encouraging results for splitNN.
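The splitNN idea fits in a few lines: the client computes the forward pass up to a cut layer, and only the cut-layer activations and their gradients cross the wire, never the raw data. The PyTorch sketch below simulates the client/server exchange in one process with a `detach`; the architecture and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Client keeps the raw data and the early layers; server holds the rest.
client_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
server_net = nn.Sequential(nn.Linear(128, 10))
opt_c = torch.optim.SGD(client_net.parameters(), lr=0.1)
opt_s = torch.optim.SGD(server_net.parameters(), lr=0.1)

def split_step(x, y):
    opt_c.zero_grad(); opt_s.zero_grad()
    smashed = client_net(x)                          # client forward to the cut layer
    smashed_srv = smashed.detach().requires_grad_()  # "transmit" activations only
    loss = nn.functional.cross_entropy(server_net(smashed_srv), y)
    loss.backward()                                  # server backward to the cut
    smashed.backward(smashed_srv.grad)               # "transmit" gradient; client backward
    opt_c.step(); opt_s.step()
    return loss.item()

print(split_step(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))))
```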
Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification
This work proposes a way to synthesize datasets with a continuous range of identicalness, provides performance measures for the Federated Averaging algorithm, shows that performance degrades as distributions differ more, and proposes a mitigation strategy via server momentum.
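Server momentum is a small change to the server-side update: treat the round's weighted-average client update as a pseudo-gradient and apply momentum to it (the FedAvgM idea). The sketch below works on flat NumPy parameter vectors; the names and the `lr`/`beta` defaults are illustrative.

```python
import numpy as np

def server_momentum_update(global_w, client_ws, client_ns, velocity, lr=1.0, beta=0.9):
    """One server round: weighted-average the client weights, form the
    pseudo-gradient (global - average), and apply momentum before stepping.
    With beta=0 and lr=1 this reduces to plain Federated Averaging."""
    frac = np.asarray(client_ns, dtype=float)
    frac /= frac.sum()
    avg_w = sum(f * w for f, w in zip(frac, client_ws))
    delta = global_w - avg_w              # pseudo-gradient of the round
    velocity = beta * velocity + delta    # server momentum accumulator
    return global_w - lr * velocity, velocity
```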
Distributed deep learning networks among institutions for medical imaging
It is shown that distributing deep learning models is an effective alternative to sharing patient data, and this finding has implications for any collaborative deep learning study.
Communication-Efficient Learning of Deep Networks from Decentralized Data
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
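That method, Federated Averaging (FedAvg), is the baseline most heterogeneity studies, including the present paper, build on: each round, a sampled fraction of clients trains locally from the current global weights, and the server averages the results weighted by local sample counts. A runnable toy sketch, with a hypothetical least-squares `local_train`:

```python
import numpy as np

def federated_averaging(global_w, clients, local_train, rounds=10, frac=0.5, seed=0):
    """FedAvg skeleton: sample clients, train locally, average by sample count."""
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        m = max(1, int(frac * len(clients)))
        picked = rng.choice(len(clients), size=m, replace=False)
        results = [local_train(global_w.copy(), clients[k]) for k in picked]
        ns = np.array([n for _, n in results], dtype=float)
        global_w = sum((n / ns.sum()) * w for w, n in results)
    return global_w

def local_train(w, client, lr=0.1, steps=5):
    """Illustrative local update: a few gradient steps on least squares."""
    X, y = client
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w, len(y)

rng = np.random.default_rng(0)
true_w = rng.normal(size=3)
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=50)))
print(np.round(federated_averaging(np.zeros(3), clients, local_train) - true_w, 2))
```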
Predicting cancer outcomes from histology and genomics using convolutional networks
A computational approach based on deep learning predicts the overall survival of patients diagnosed with brain tumors from microscopic images of tissue biopsies and genomic biomarkers, and surpasses the prognostic accuracy of human experts using the current clinical standard for classifying brain tumors.
The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository
The management tasks and user support model for TCIA, an open-source, open-access information resource supporting research, development, and educational initiatives that utilize advanced medical imaging of cancer, are described.
Wavelet-based Semi-supervised Adversarial Learning for Synthesizing Realistic 7T from 3T MRI
A novel wavelet-based semi-supervised adversarial learning framework is proposed that synthesizes 7T MR images from their 3T counterparts through a cycle generative adversarial network operating in the joint spatial-wavelet domain for the synthesis of multi-frequency details.
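The core trick, supervising synthesis jointly in the spatial and wavelet domains so high-frequency detail is penalized explicitly, can be illustrated with a composite loss. This toy uses PyWavelets on NumPy images with simple L1 terms; it is a stand-in for the reference's full adversarial objective, and `lam` and the Haar wavelet are assumptions.

```python
import numpy as np
import pywt

def joint_spatial_wavelet_loss(pred, target, wavelet="haar", lam=1.0):
    """L1 error in image space plus L1 error on the four 2-D DWT subbands,
    so fine detail (the hard part of 3T->7T synthesis) gets its own penalty."""
    spatial = np.abs(pred - target).mean()
    pa, (ph, pv, pd) = pywt.dwt2(pred, wavelet)
    ta, (th, tv, td) = pywt.dwt2(target, wavelet)
    bands = [(pa, ta), (ph, th), (pv, tv), (pd, td)]
    wav = sum(np.abs(p - t).mean() for p, t in bands) / len(bands)
    return spatial + lam * wav

x, y = np.random.rand(64, 64), np.random.rand(64, 64)
print(joint_spatial_wavelet_loss(x, y))
```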
Radiomics: the bridge between medical imaging and personalized medicine
Radiomics, the high-throughput mining of quantitative image features from standard-of-care medical imaging that enables data to be extracted and applied within clinical-decision support systems to…
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
This paper finds that 99.9% of the gradient exchange in distributed SGD is redundant and proposes Deep Gradient Compression (DGC) to greatly reduce the communication bandwidth, enabling large-scale distributed training on inexpensive commodity 1 Gbps Ethernet and facilitating distributed training on mobile devices.
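The mechanism behind that compression rate is top-k gradient sparsification with local residual accumulation: only the largest-magnitude entries are transmitted each step, and the untransmitted remainder is carried forward so no gradient signal is permanently dropped. A minimal NumPy sketch of just that core (DGC's momentum correction, clipping, and warm-up are omitted):

```python
import numpy as np

def topk_sparsify(grad, residual, keep_ratio=0.001):
    """Add the carried-over residual, transmit only the top-k entries by
    magnitude, and carry the rest forward as the new residual."""
    acc = grad + residual                    # fold in previously unsent gradient
    flat = acc.ravel()
    k = max(1, int(keep_ratio * flat.size))
    top = np.argpartition(np.abs(flat), -k)[-k:]  # indices of top-k magnitudes
    sparse = np.zeros_like(flat)
    sparse[top] = flat[top]                  # this is what gets communicated
    new_residual = (flat - sparse).reshape(grad.shape)
    return sparse.reshape(grad.shape), new_residual

g = np.random.randn(1000, 1000)
sent, resid = topk_sparsify(g, np.zeros_like(g))
print(f"transmitted {np.count_nonzero(sent)} of {g.size} entries")
```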