Brief Announcement: Byzantine-Tolerant Machine Learning

Authors: P. Blanchard, El Mahdi El Mhamdi, R. Guerraoui, J. Stainer
Venue: Proceedings of the ACM Symposium on Principles of Distributed Computing
We report on Krum, the first provably Byzantine-tolerant aggregation rule for distributed Stochastic Gradient Descent (SGD). Krum guarantees the convergence of SGD even in a distributed setting where (asymptotically) up to half of the workers can be malicious adversaries trying to attack the learning system. 
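The announcement above does not spell out the rule itself. As a minimal sketch, assuming the commonly stated definition of Krum (each of the n proposed gradients is scored by the sum of squared distances to its n - f - 2 nearest neighbours, and the lowest-scoring gradient is selected, under the condition n > 2f + 2), the aggregation step might look like the following. The function and variable names are illustrative and do not come from the authors' code.

```python
from typing import List, Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def krum(gradients: Sequence[Sequence[float]], f: int) -> List[float]:
    """Return the proposed gradient with the lowest Krum score.

    Assumption: `gradients` holds one vector per worker and at most
    `f` of them are Byzantine. The rule is only meaningful when
    n > 2f + 2, which is why the tolerated fraction approaches one
    half as n grows.
    """
    n = len(gradients)
    if n <= 2 * f + 2:
        raise ValueError("Krum requires n > 2f + 2 workers")

    neighbours = n - f - 2  # how many closest vectors enter each score
    best_index, best_score = -1, float("inf")
    for i, g in enumerate(gradients):
        # Squared distances from worker i's vector to every other vector.
        dists = sorted(
            squared_distance(g, h)
            for j, h in enumerate(gradients)
            if j != i
        )
        score = sum(dists[:neighbours])
        if score < best_score:
            best_index, best_score = i, score
    return list(gradients[best_index])
```

With four nearby honest vectors and one far-away outlier (f = 1), the outlier's score is dominated by its large distances, so one of the honest vectors is selected. Unlike plain averaging, a single extreme proposal cannot move the output arbitrarily far.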
Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent
Proposes Krum, an aggregation rule that satisfies a resilience property capturing the basic requirements to guarantee convergence despite f Byzantine workers, and argues that it is the first provably Byzantine-resilient algorithm for distributed SGD.
Simeon - Secure Federated Machine Learning Through Iterative Filtering
Simeon is a novel approach to aggregation that applies a reputation-based iterative filtering technique to achieve robustness even in the presence of attackers who can exhibit arbitrary behaviour, and it tolerates sybil attacks where other algorithms do not.
Robust Distributed Learning and Robust Learning Machines
Whether it occurs in artificial or biological substrates, learning is a distributed phenomenon in at least two aspects. First, meaningful data and experiences are rarely found in one location, hence…


Byzantine-Tolerant Machine Learning
This paper studies robustness to Byzantine failures at the fundamental level of stochastic gradient descent (SGD), the heart of most machine learning algorithms, and proposes Krum, an update rule that satisfies the required resilience property.
Deep learning with Elastic Averaging SGD
Experiments demonstrate that the new algorithm accelerates the training of deep architectures compared to DOWNPOUR and other common baseline approaches, and furthermore is very communication efficient.
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
An ergodic convergence rate is established for both asynchronous parallel implementations of stochastic gradient, and it is proved that linear speedup is achievable if the number of workers is bounded by $\sqrt{K}$ ($K$ is the total number of iterations).
On-line learning and stochastic approximations
This framework encompasses the most common online learning algorithms in use today, as illustrated by several examples, and provides general results describing the convergence of all these learning algorithms at once.
Multidimensional approximate agreement in Byzantine asynchronous systems
This paper generalizes the problem of ε-approximate agreement in Byzantine asynchronous systems to consider values that lie in $\mathbb{R}^m$, for $m \geq 1$, and presents a protocol that is optimal with regard to fault tolerance.
Online Learning and Stochastic Approximations
The convergence of online learning algorithms is analyzed using the tools of stochastic approximation theory, and proved under very weak conditions. A general framework for online learning…
Large-Scale Machine Learning with Stochastic Gradient Descent
A more precise analysis uncovers qualitatively different tradeoffs for the case of small-scale and large-scale learning problems.
TensorFlow: A system for large-scale machine learning
The TensorFlow dataflow model is described, and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.
The Byzantine Generals Problem
It is shown that, using only oral messages, the problem of a group of generals camped with their troops around an enemy city is solvable if and only if more than two-thirds of the generals are loyal; so a single traitor can confound two loyal generals.
Implementing fault-tolerant services using the state machine approach: a tutorial
The state machine approach is a general method for implementing fault-tolerant services in distributed systems, and protocols for two different failure models, Byzantine and fail-stop, are described.