Corpus ID: 220381291

Coded Computing for Federated Learning at the Edge

Saurav Prakash, Sagar Dhakal, Mustafa Riza Akdeniz, Amir Salman Avestimehr, Nageen Himayat
Federated Learning (FL) is an exciting new paradigm that enables training a global model from data generated locally at the client nodes, without moving client data to a centralized server. Performance of FL in a multi-access edge computing (MEC) network suffers from slow convergence due to heterogeneity and stochastic fluctuations in compute power and communication link qualities across clients. A recent work, Coded Federated Learning (CFL), proposes to mitigate stragglers and speed up…


Coded Computing for Low-Latency Federated Learning Over Wireless Edge Networks
This work proposes a novel coded computing framework, CodedFedL, that injects structured coding redundancy into federated learning for mitigating stragglers and speeding up the training procedure.
A Survey of Coded Distributed Computing
This survey reviews and analyzes coded distributed computing (CDC) approaches proposed to reduce communication costs, mitigate straggler effects, and guarantee privacy and security.
Distributed Learning Applications in Power Systems: A Review of Methods, Gaps, and Challenges
This paper summarizes the methods, benefits, and challenges of distributed learning frameworks in power systems and identifies the gaps in the literature for future studies.
6G: Connectivity in the Era of Distributed Intelligence
This paper poses pervasive distributed intelligence as a (sub) vision for 6G, and presents how joint innovations in AI, compute and networking will be necessary to achieve it.
FedML: A Research Library and Benchmark for Federated Machine Learning
This work introduces FedML, an open research library and benchmark that facilitates the development of new federated learning algorithms and fair performance comparisons, giving the federated learning research community an efficient and reproducible means of developing and evaluating algorithms.


Federated Learning with Non-IID Data
This work presents a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices, and shows that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data.
Federated Learning: Strategies for Improving Communication Efficiency
Two ways to reduce the uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, which learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling.
Coded Computing for Distributed Machine Learning in Wireless Edge Network
This work proposes a coded computation framework that uses statistical knowledge of resource heterogeneity to determine the optimal encoding and load balancing of training data using random linear codes, while avoiding an explicit step for decoding gradients.
Communication-Efficient Learning of Deep Networks from Decentralized Data
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning
This work proposes a distributed optimization framework where the dataset is "encoded" to have an over-complete representation with built-in redundancy. The straggling nodes in the system are dynamically left out of the computation at every iteration, and their loss is compensated by the embedded redundancy.
Joint Communication, Computation, Caching, and Control in Big Data Multi-Access Edge Computing
The problem of joint 4C in big data MEC is formulated as an optimization problem whose goal is to jointly optimize a linear combination of the bandwidth consumption and network latency, but the formulated problem is shown to be non-convex.
Communication-Computation Efficient Gradient Coding
This paper develops coding techniques to reduce the running time of distributed learning tasks by giving an explicit coding scheme that achieves the optimal tradeoff based on recursive polynomial constructions, coding both across data subsets and vector components.
Coded computation over heterogeneous clusters
This paper proposes the Heterogeneous Coded Matrix Multiplication (HCMM) algorithm for distributed matrix multiplication over heterogeneous clusters, shows that it is provably asymptotically optimal, and provides numerical results demonstrating speedups of up to 49% and 34% for HCMM in comparison to the “uncoded” and “homogeneous coded” schemes.
Speeding Up Distributed Machine Learning Using Codes
This paper focuses on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling, and uses codes to reduce communication bottlenecks, exploiting the excess in storage.
Straggler Mitigation in Distributed Optimization Through Data Encoding
This paper proposes several encoding schemes, and demonstrates that popular batch algorithms, such as gradient descent and L-BFGS, applied in a coding-oblivious manner, deterministically achieve sample path linear convergence to an approximate solution of the original problem, using an arbitrarily varying subset of the nodes at each iteration.
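
The data-encoding idea that recurs in the summaries above (encode the dataset with built-in redundancy, then run plain gradient descent on whichever nodes respond) can be illustrated with a small sketch. This is not the method of any listed paper: the Gaussian encoding matrix, the least-squares loss, the random choice of stragglers, and every name and parameter below (`encoded_gradient_descent`, `n_nodes`, `n_wait`, `redundancy`) are assumptions made for illustration only.

```python
import numpy as np


def encoded_gradient_descent(X, y, n_nodes=8, n_wait=6, redundancy=2,
                             iters=200, seed=0):
    """Hypothetical sketch: gradient descent on encoded least-squares data,
    using only a varying subset of nodes at each iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = redundancy * n  # number of encoded rows (over-complete when redundancy > 1)

    # Assumed encoding: i.i.d. Gaussian matrix scaled so that E[S.T @ S] = I.
    S = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))
    X_enc, y_enc = S @ X, S @ y

    # Each node stores one contiguous block of encoded rows.
    blocks = np.array_split(np.arange(m), n_nodes)

    # The largest singular value of the full encoded matrix bounds that of
    # any row subset, so this step size is safe for every set of active nodes.
    step = 1.0 / np.linalg.norm(X_enc, 2) ** 2

    w = np.zeros(d)
    for _ in range(iters):
        # Simulate stragglers: only n_wait randomly chosen nodes respond.
        active = rng.choice(n_nodes, size=n_wait, replace=False)
        grad = np.zeros(d)
        for k in active:
            rows = blocks[k]
            grad += X_enc[rows].T @ (X_enc[rows] @ w - y_enc[rows])
        w -= step * grad  # coding-oblivious update: no decoding step
    return w


# Usage on synthetic, noiseless data (an assumption that makes recovery exact).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true
w_hat = encoded_gradient_descent(X, y)
```

Because the synthetic labels here are noiseless, every subset of encoded rows shares the same minimizer, so `w_hat` should match `w_true` closely. With noisy labels the iterates would only reach an approximate solution, which is the regime the summaries above describe.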