Collaborative deep learning across multiple data centers

  • Kele Xu, Haibo Mi, Dawei Feng, Huaimin Wang, Chuan Chen, Zibin Zheng, Xu Lan
  • Published 16 October 2018
  • Computer Science
  • Science China Information Sciences
Valuable training data is often owned by independent organizations and located in multiple data centers. Most deep learning approaches require centralizing the multi-datacenter data for performance reasons. In practice, however, it is often infeasible to transfer all data of different organizations to a centralized data center owing to privacy regulations. It is very challenging to conduct geo-distributed deep learning among data centers without privacy leaks. Model… 
Precision-Weighted Federated Learning
Precision-Weighted Federated Learning is proposed, a novel algorithm that takes into account the second raw moment (uncentered variance) of the stochastic gradient when computing the weighted average of the parameters of independent models trained in a federated learning setting.
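The aggregation idea can be sketched as inverse-second-moment weighting of client parameters: clients whose gradients have a larger second raw moment (noisier updates) contribute less. This is a minimal illustration of the weighting principle, not the paper's exact algorithm; the function name and weighting rule are illustrative.

```python
def precision_weighted_average(client_params, client_second_moments, eps=1e-8):
    """Per-coordinate average of client parameters, weighted by the inverse
    of each client's gradient second raw moment (a precision proxy).

    client_params:         list of parameter vectors, one per client
    client_second_moments: matching list of per-coordinate second moments
    """
    dim = len(client_params[0])
    averaged = []
    for i in range(dim):
        # higher second moment -> noisier gradient -> smaller weight
        weights = [1.0 / (m[i] + eps) for m in client_second_moments]
        total = sum(weights)
        averaged.append(sum(w * p[i] for w, p in zip(weights, client_params)) / total)
    return averaged

# A client with a 3x larger second moment gets a 3x smaller weight:
avg = precision_weighted_average([[1.0], [3.0]], [[1.0], [3.0]])
```

With equal second moments this reduces to a plain parameter average; the unequal case above pulls the result toward the lower-variance client.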
Multistructure-Based Collaborative Online Distillation
A cross-architecture online-distillation approach that uses the ensemble method to aggregate networks of different structures, forming better teachers than traditional distillation methods and achieving strong improvements in network performance.
Non-IID federated learning via random exchange of local feature maps for textile IIoT secure computing
A novel federated framework for secure textile identity management (FedTFI) is proposed, based on cross-domain texture representations of high-definition fabric images; it achieves better detection accuracy than benchmarks in four non-IID scenarios while keeping data private for secure computing in fabric IIoT.
Squeeze-and-Excitation network-Based Radar Object Detection With Weighted Location Fusion
This work proposes a novel cross-modality deep learning framework for the radar object detection task using the Squeeze-and-Excitation network, aiming to provide more powerful feature representations, and investigates the effectiveness of the proposed framework on the 2021 ICMR ROD challenge.
Direct Neuron-Wise Fusion of Cognate Neural Networks
The network created by fusing cognate neural networks showed consistent improvement on average over the commercial-grade, domain-free network originating from the parent model, and the fusion that considers input connections to the neuron is demonstrated to achieve the highest accuracy in experiments.
A bird’s-eye view of deep learning in bioimage analysis
  • E. Meijering
  • Computer Science, Biology
    Computational and structural biotechnology journal
  • 2020


Communication-Efficient Learning of Deep Networks from Decentralized Data
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
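The iterative model-averaging step at the heart of federated learning can be sketched as a data-size-weighted average of the clients' locally trained parameters. This is a minimal illustration of the averaging rule, with parameters represented as plain lists of floats rather than real network weights.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of per-client parameter vectors.

    client_params: list of parameter vectors, one per client
    client_sizes:  number of local training examples per client
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        weight = size / total  # clients with more data count for more
        for i, p in enumerate(params):
            averaged[i] += weight * p
    return averaged

# Two clients, one with 100 local examples and one with 300:
global_params = federated_average([[1.0, 2.0], [5.0, 6.0]], [100, 300])
print(global_params)  # [4.0, 5.0]
```

In a full training loop, the server broadcasts `global_params` back to the clients, each client runs several local SGD epochs, and the cycle repeats.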
Towards Geo-Distributed Machine Learning
It is shown that the current centralized practice can be far from optimal, and a system for doing geo-distributed training is proposed, which is structurally more amenable to dealing with regulatory constraints, as raw data never leaves the source data center.
Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds
A new, general geo-distributed ML system, Gaia, is introduced that decouples the communication within a data center from the communication between data centers, enabling different communication and consistency models for each.
Privacy-preserving deep learning
  • R. Shokri, Vitaly Shmatikov
  • Computer Science
    2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)
  • 2015
This paper presents a practical system that enables multiple parties to jointly learn an accurate neural-network model for a given objective without sharing their input datasets, and exploits the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously.
Scalable and Fault Tolerant Platform for Distributed Learning on Private Medical Data
It is demonstrated that InsuLearn successfully integrates accurate models for horizontally partitioned data while preserving privacy, and that the liveness of the system is guaranteed as institutions join and leave the network.
Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks
This paper introduces a new parallel training framework called Ensemble-Compression, denoted as EC-DNN, and proposes to aggregate the local models by ensemble, i.e., averaging the outputs of local models instead of the parameters.
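The ensemble step described above replaces parameter averaging with output averaging: the local models' predictions, not their weights, are combined. A minimal sketch, with models stood in by plain functions mapping an input to class scores:

```python
def ensemble_predict(models, x):
    """Average the output vectors of several local models on input x."""
    outputs = [m(x) for m in models]
    n = len(outputs)
    return [sum(o[i] for o in outputs) / n for i in range(len(outputs[0]))]

# Two stand-in "local models" producing two-class scores:
model_a = lambda x: [0.5, 0.25]
model_b = lambda x: [0.5, 0.75]
print(ensemble_predict([model_a, model_b], None))  # [0.5, 0.5]
```

Output averaging works even when the local networks have different architectures, which is exactly why EC-DNN follows the ensemble with a compression (distillation) step to recover a single deployable model.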
Large scale distributed neural network training through online distillation
This paper claims that online distillation is a cost-effective way to make the exact predictions of a model dramatically more reproducible, and that it can still speed up training even after reaching the point at which additional parallelism provides no benefit for synchronous or asynchronous stochastic gradient descent.
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
This paper finds that 99.9% of the gradient exchange in distributed SGD is redundant, and proposes Deep Gradient Compression (DGC) to greatly reduce the communication bandwidth, enabling large-scale distributed training on inexpensive commodity 1 Gbps Ethernet and facilitating distributed training on mobile devices.
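The core mechanism can be sketched as top-k gradient sparsification with residual accumulation: only the largest-magnitude entries are communicated, and the rest are carried over to the next step so no gradient information is lost. This is a hedged illustration of that one idea; DGC's momentum correction, warm-up, and other refinements are omitted.

```python
def sparsify(grad, residual, k):
    """Return (sparse_update, new_residual) keeping only the top-k entries.

    grad:     the fresh local gradient vector
    residual: gradient mass withheld from previous steps
    """
    accumulated = [g + r for g, r in zip(grad, residual)]
    # indices of the k largest-magnitude accumulated values
    top = sorted(range(len(accumulated)),
                 key=lambda i: abs(accumulated[i]), reverse=True)[:k]
    sparse = [accumulated[i] if i in top else 0.0 for i in range(len(accumulated))]
    # everything not sent stays local and is added to the next gradient
    new_residual = [accumulated[i] - sparse[i] for i in range(len(accumulated))]
    return sparse, new_residual

grad = [0.1, -0.9, 0.3, 0.05]
sparse, residual = sparsify(grad, [0.0] * 4, k=1)
print(sparse)    # [0.0, -0.9, 0.0, 0.0]
print(residual)  # [0.1, 0.0, 0.3, 0.05]
```

Because small entries accumulate in the residual, they are eventually transmitted once they grow large enough, which is what keeps the heavily compressed training from diverging.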
Federated Machine Learning
This work introduces a comprehensive secure federated-learning framework, including horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing work on the subject.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
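The batch-normalization forward pass can be sketched in a few lines: each feature is normalized to zero mean and unit variance over the batch, then rescaled and shifted by the learned parameters gamma and beta. A minimal single-feature illustration (real implementations also track running statistics for inference):

```python
def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one scalar feature across a batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps guards against division by zero when the batch variance is tiny
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in batch]

out = batch_norm([1.0, 2.0, 3.0])
# the normalized batch has zero mean and is symmetric around it
```

Keeping each layer's input distribution stable in this way is what allows the much higher learning rates behind the 14x reduction in training steps reported above.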