Communication-Aware Collaborative Learning

@inproceedings{Blum2020CommunicationAwareCL,
  title={Communication-Aware Collaborative Learning},
  author={Avrim Blum and Shelby Heinecke and L. Reyzin},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2020}
}
Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication-efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an… 
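
As a rough illustration of the distributed-boosting flavor of algorithm the abstract describes (a generic sketch, not the paper's procedure: the player count, the depth-1 stump weak learner, the subsample size, and the way communication is tallied are all placeholder assumptions), a minimal Python example:

# Illustrative sketch of a communication-counting distributed boosting loop.
# NOT the paper's algorithm: the stump learner, subsample size, and synthetic
# data are placeholder assumptions for demonstration only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
k, n_per_player, rounds, subsample = 4, 200, 10, 30

def make_player_data(shift):
    # Synthetic stand-in for one player's distribution (placeholder assumption).
    X = rng.normal(loc=shift, scale=1.0, size=(n_per_player, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

players = [make_player_data(s) for s in np.linspace(-1.0, 1.0, k)]
weights = [np.full(n_per_player, 1.0 / n_per_player) for _ in range(k)]
hypotheses, alphas = [], []
examples_sent = 0  # crude communication tally: examples shipped to the center

for t in range(rounds):
    # Each player sends only a small weighted subsample to the center.
    Xs, ys = [], []
    for (X, y), w in zip(players, weights):
        idx = rng.choice(n_per_player, size=subsample, p=w)
        Xs.append(X[idx]); ys.append(y[idx])
        examples_sent += subsample
    Xc, yc = np.vstack(Xs), np.concatenate(ys)

    # Center fits a weak hypothesis (a depth-1 stump) and broadcasts it back.
    stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(Xc, yc)

    # Players reweight their own local data AdaBoost-style using the broadcast stump.
    errs = [np.average(stump.predict(X) != y, weights=w)
            for (X, y), w in zip(players, weights)]
    err = float(np.clip(np.mean(errs), 1e-6, 0.5 - 1e-6))
    alpha = 0.5 * np.log((1.0 - err) / err)
    for i, (X, y) in enumerate(players):
        miss = stump.predict(X) != y
        weights[i] = weights[i] * np.exp(alpha * np.where(miss, 1.0, -1.0))
        weights[i] /= weights[i].sum()
    hypotheses.append(stump); alphas.append(alpha)

print(f"rounds: {rounds}, examples sent to center: {examples_sent}, "
      f"hypotheses broadcast: {len(hypotheses)}")

The only point of the sketch is that communication scales with the number of boosting rounds times the small per-round subsample and broadcast hypotheses, rather than with the players' full datasets.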

Citations

On-Demand Sampling: Learning Optimally from Multiple Distributions

The optimal sample complexity of multi-distribution learning paradigms, such as collaborative, group distributionally robust, and fair federated learning, is established, and algorithms that meet this sample complexity are given.

References

Showing 1-10 of 13 references

Collaborative PAC Learning

A collaborative PAC learning model is introduced in which $k$ players attempt to learn the same underlying concept, and an $\Omega(\ln k)$ lower bound on the overhead is proved, showing that the presented results are tight up to a logarithmic factor.
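
(For orientation, since several summaries here quote "overhead": under the usual convention, which these one-line summaries do not spell out, if $(\epsilon,\delta)$-PAC learning a single task requires $m_1(\epsilon,\delta)$ samples and a collaborative algorithm for $k$ players draws $m_k(\epsilon,\delta)$ samples in total, its overhead is the ratio $m_k(\epsilon,\delta)/m_1(\epsilon,\delta)$; learning each task separately gives overhead $k$, and the algorithms discussed here aim for overhead logarithmic or polylogarithmic in $k$.)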

Communication Efficient Distributed Agnostic Boosting

This work proposes a general distributed boosting-based procedure for learning an arbitrary concept space, that is simultaneously noise tolerant, communication efficient, and computationally efficient.

Improved Algorithms for Collaborative PAC Learning

New algorithms for both the realizable and the non-realizable setting are designed, having sample complexity only $O(\ln (k))$ times the worst-case sample complexity for learning a single task.

Tight Bounds for Collaborative PAC Learning via Multiplicative Weights

A collaborative learning algorithm with $O(\ln k)$ overhead is obtained, improving on the $O(\ln^2 k)$-overhead algorithm of BHPQ17, and it is shown that an $\Omega(\ln k)$ overhead is inevitable when $k$ is polynomially bounded by the VC dimension of the hypothesis class.

PAC-Learning with General Class Noise Models

A framework for class noise is introduced, in which most of the known class noise models for the PAC setting can be formulated, and the Empirical Risk Minimization strategy is generalized to a more powerful strategy.

Do Outliers Ruin Collaboration?

An algorithm is presented that achieves an $O(\eta n + \ln n)$ overhead, which is proved to be worst-case optimal, and the potential challenges to the design of a computationally efficient learning algorithm with a small overhead are discussed.

Distributed Learning, Communication Complexity and Privacy

General upper and lower bounds on the amount of communication needed to learn well are provided, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role.

Learning From Noisy Examples

This paper shows that when the teacher may make independent random errors in classifying the example data, the strategy of selecting the most consistent rule for the sample is sufficient, and usually requires a feasibly small number of examples, provided noise affects less than half the examples on average.

Boosting in the presence of noise

A variant of the standard scenario for boosting in which the "weak learner" satisfies a slightly stronger condition than the usual weak learning guarantee is considered, and an efficient algorithm is given which can boost to arbitrarily high accuracy in the presence of classification noise.

A decision-theoretic generalization of on-line learning and an application to boosting

The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
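
(As a concrete reminder of the rule being referred to, stated in its usual form rather than quoted from the paper: the algorithm keeps a weight $w_i$ per expert, predicts with the normalized weights $p_i = w_i / \sum_j w_j$, and after observing losses $\ell_i \in [0,1]$ updates $w_i \leftarrow w_i \beta^{\ell_i}$ for a parameter $\beta \in [0,1)$, so experts that incur loss are down-weighted multiplicatively.)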