Fundamental Limits of Decentralized Data Shuffling

Kai Wan, Daniela Tuninetti, Mingyue Ji, G. Caire, P. Piantanida
IEEE Transactions on Information Theory
Data shuffling of training data among different computing nodes (workers) has been identified as a core element for improving the statistical performance of modern large-scale machine learning algorithms. Data shuffling is also often considered one of the most significant bottlenecks in such systems due to its heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communication between the master and the workers is allowed), coding has…
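The coded-communication idea behind these gains can be illustrated with a toy sketch (this is a generic index-coding example, not the scheme from the paper): when each worker already caches the block the other worker needs, the master can serve both with a single XOR-coded broadcast instead of two uncoded transmissions.

```python
# Toy sketch of coded data shuffling (assumed setup, not the paper's scheme):
# worker 1 caches block A and wants B; worker 2 caches block B and wants A.
A = b"block-A"
B = b"block-B"

def xor(x, y):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(a ^ b for a, b in zip(x, y))

coded = xor(A, B)            # master broadcasts one coded block (A XOR B)
recovered_B = xor(coded, A)  # worker 1 cancels its cached A to get B
recovered_A = xor(coded, B)  # worker 2 cancels its cached B to get A

assert recovered_A == A and recovered_B == B
```

One coded broadcast replaces two uncoded unicasts, which is the kind of communication-load reduction the coded-shuffling literature quantifies.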
Analyzing the Interplay Between Random Shuffling and Storage Devices for Efficient Machine Learning
  • Zhi-Lin Ke, Hsiang-Yun Cheng, Chia-Lin Yang, Han-Wei Huang
  • Computer Science
  • 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
  • 2021
An Umbrella Converse for Data Exchange: Applied to Caching, Computing, Shuffling & Rebalancing
Topological Coded Distributed Computing
On the Optimal Load-Memory Tradeoff of Cache-Aided Scalar Linear Function Retrieval
Coded Elastic Computing on Machines With Heterogeneous Storage and Computation Speed
Heterogeneous Computation Assignments in Coded Elastic Computing
On the Optimality of D2D Coded Caching With Uncoded Cache Placement and One-Shot Delivery
Device-to-Device Coded-Caching With Distinct Cache Sizes
Information Theoretic Limits of Data Shuffling for Distributed Learning
  • M. Attia, R. Tandon
  • Computer Science, Mathematics
  • 2016 IEEE Global Communications Conference (GLOBECOM)
  • 2016
Near Optimal Coded Data Shuffling for Distributed Learning
On the worst-case communication overhead for distributed data shuffling
  • M. Attia, R. Tandon
  • Computer Science, Mathematics
  • 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
  • 2016
On the Fundamental Limits of Coded Data Shuffling
Leveraging Coding Techniques for Speeding up Distributed Computing
UberShuffle: Communication-efficient Data Shuffling for SGD via Coding Theory
Coded Distributed Computing with Heterogeneous Function Assignments
A Scalable Framework for Wireless Distributed Computing
Cascaded Coded Distributed Computing on Heterogeneous Networks
A New Combinatorial Design of Coded Distributed Computing