Corpus ID: 238856649

Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing

Sai Praneeth Karimireddy, Lie He, Martin Jaggi
In Byzantine robust distributed or federated learning, a central server wants to train a machine learning model over data distributed across multiple workers. However, a fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages. While this problem has received significant attention recently, most current defenses assume that the workers have identical data. For realistic cases when the data across workers are heterogeneous (non-iid), we design new attacks… 
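The bucketing idea named in the title can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: worker gradients are randomly shuffled, grouped into buckets of size `s`, averaged within each bucket, and the bucket means are then passed to an existing robust aggregator (here, the coordinate-wise median, an illustrative choice). Mixing heterogeneous gradients inside buckets reduces the variance between the inputs seen by the aggregator.

```python
import numpy as np

def coordinate_median(x):
    """Coordinate-wise median across the first axis."""
    return np.median(x, axis=0)

def bucketing_aggregate(grads, s, robust_agg, seed=None):
    """Bucketing sketch: shuffle the n worker gradients, split them into
    buckets of at most s gradients, average within each bucket, and apply
    a robust aggregator to the bucket means."""
    rng = np.random.default_rng(seed)
    grads = np.asarray(grads, dtype=float)
    n = len(grads)
    perm = rng.permutation(n)
    buckets = [grads[perm[i:i + s]] for i in range(0, n, s)]
    bucket_means = np.stack([b.mean(axis=0) for b in buckets])
    return robust_agg(bucket_means)
```

For example, with six identical honest gradients, `bucketing_aggregate(g, s=2, robust_agg=coordinate_median)` simply recovers that common gradient; the interesting case is when some workers send arbitrary vectors and the bucket means dilute their influence before the median is taken.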
1 Citation
Strategyproof Learning: Building Trustworthy User-Generated Datasets
This paper proposes LICCHAVI, the first personalized collaborative learning framework with provable strategyproofness guarantees, achieved through careful design of the underlying loss function, and proves that LICCHAVI is Byzantine resilient: it tolerates a minority of users who provide arbitrary data.


DRACO: Byzantine-resilient Distributed Training via Redundant Gradients
DRACO is presented, a scalable framework for robust distributed training that uses ideas from coding theory and comes with problem-independent robustness guarantees, and is shown to be several times to orders of magnitude faster than median-based approaches.
RSA: Byzantine-Robust Stochastic Aggregation Methods for Distributed Learning from Heterogeneous Datasets
This paper shows that RSA converges to a near-optimal solution with a learning error that depends on the number of Byzantine workers, and that the convergence rate of RSA under Byzantine attacks matches that of stochastic gradient descent in the absence of such attacks.
Byzantine-Robust Decentralized Learning via Self-Centered Clipping
A Self-Centered Clipping (SCClip) algorithm for Byzantine-robust consensus and optimization, which is the first to provably converge to a $O(\delta_{\max}\zeta^2/\gamma^2)$ neighborhood of the stationary point for non-convex objectives under standard assumptions.
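The centered-clipping rule behind SCClip can be sketched numerically. This is a hypothetical illustration under assumed notation: starting from a reference point $v$, each input's deviation $x_i - v$ is clipped to a radius $\tau$, the clipped deviations are averaged, and $v$ is updated; a few iterations suffice to pull the estimate toward the honest cluster while bounding any single outlier's pull to at most $\tau / n$ per step.

```python
import numpy as np

def centered_clip(xs, v, tau, iters=3):
    """Centered-clipping sketch: iteratively move the reference point v by
    the average of the input deviations, each clipped to norm at most tau."""
    xs = np.asarray(xs, dtype=float)
    v = np.asarray(v, dtype=float)
    for _ in range(iters):
        diffs = xs - v
        norms = np.linalg.norm(diffs, axis=1, keepdims=True)
        # Scale each deviation down to radius tau if it is longer than tau.
        scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
        v = v + (diffs * scale).mean(axis=0)
    return v
```

With ten honest vectors at the origin and one adversarial vector far away, the clipped update stays within a small neighborhood of the honest mean, since the outlier's clipped contribution is bounded by `tau / n` per iteration.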
Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data
At the core of the algorithm, the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter-out corrupt gradients is used to give a trade-off between the mini-batch size for stochastic gradients and the approximation error.
Robust Federated Learning in a Heterogeneous Environment
A general statistical model is proposed which takes both the cluster structure of the users and the Byzantine machines into account, and statistical guarantees are proved for an outlier-robust clustering algorithm that can be viewed as Lloyd's algorithm with robust estimation.
On the Byzantine Robustness of Clustered Federated Learning
This work investigates the application of CFL to Byzantine settings, where a subset of clients behaves unpredictably or tries to disturb the joint training effort in a directed or undirected way, and demonstrates that CFL (without modifications) is able to reliably detect Byzantine clients and remove them from training.
Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data
This work presents what is believed to be the first Byzantine-resilient algorithm and analysis with local iterations in the presence of malicious/Byzantine clients, and derives convergence results under minimal assumptions of bounded variance for SGD and bounded gradient dissimilarity in the statistically heterogeneous data setting.
AGGREGATHOR: Byzantine Machine Learning via Robust Gradient Aggregation
A framework that implements state-of-the-art robust (Byzantine-resilient) distributed stochastic gradient descent and quantifies the overhead of Byzantine resilience of AGGREGATHOR to 19% and 43% compared to vanilla TensorFlow.
Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent
Krum is proposed, an aggregation rule satisfying a resilience property that captures the basic requirements for guaranteeing convergence despite f Byzantine workers, and is argued to be the first provably Byzantine-resilient algorithm for distributed SGD.
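The Krum rule summarized above can be sketched as follows (a minimal illustration under assumed conventions, not the authors' reference code): each gradient is scored by the sum of squared distances to its $n - f - 2$ nearest peers, and the gradient with the lowest score is selected, so a vector far from the honest cluster accumulates a large score and is never chosen.

```python
import numpy as np

def krum(grads, f):
    """Krum sketch: score each of the n gradients by the sum of squared
    distances to its n - f - 2 closest other gradients, and return the
    gradient with the smallest score."""
    grads = np.asarray(grads, dtype=float)
    n = len(grads)
    # Pairwise squared Euclidean distances between all gradients.
    d2 = np.sum((grads[:, None, :] - grads[None, :, :]) ** 2, axis=-1)
    scores = []
    for i in range(n):
        others = np.sort(np.delete(d2[i], i))
        scores.append(others[: n - f - 2].sum())
    return grads[int(np.argmin(scores))]
```

For instance, with four honest gradients near zero and one adversarial gradient at a large value, `krum(grads, f=1)` returns one of the honest gradients, since the outlier's nearest-neighbor distances are all large.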
Collaborative Learning in the Jungle (Decentralized, Byzantine, Heterogeneous, Asynchronous and Nonconvex Learning)
It is proved that collaborative learning is equivalent to a new form of agreement, called averaging agreement, and this equivalence yields new impossibility theorems on what any collaborative learning algorithm can achieve in adversarial and heterogeneous environments.