Corpus ID: 59316742

Distributed Learning with Compressed Gradient Differences

@article{Mishchenko2019DistributedLW,
  title={Distributed Learning with Compressed Gradient Differences},
  author={Konstantin Mishchenko and Eduard A. Gorbunov and Martin Tak{\'a}{\v{c}} and Peter Richt{\'a}rik},
  journal={ArXiv},
  year={2019},
  volume={abs/1901.09269}
}
  • Konstantin Mishchenko, Eduard A. Gorbunov, Martin Takáč, Peter Richtárik
  • Published 2019
  • Mathematics, Computer Science
  • ArXiv
  • Training large machine learning models requires a distributed computing approach, with communication of the model updates being the bottleneck. For this reason, several methods based on the compression (e.g., sparsification and/or quantization) of updates were recently proposed, including QSGD (Alistarh et al., 2017), TernGrad (Wen et al., 2017), SignSGD (Bernstein et al., 2018), and DQGD (Khirirat et al., 2018). However, none of these methods are able to learn the gradients, which renders them…
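
The title points at the paper's key mechanism: rather than compressing each worker's gradient directly, the worker compresses the difference between its current gradient and a locally maintained gradient estimate, so the compression error can shrink as that estimate "learns" the gradient. The sketch below illustrates this idea on a toy least-squares problem. It is a minimal sketch, not the paper's implementation: the random-k sparsification compressor, the helper names (rand_sparsify, local_gradient), and the parameter values (lr, alpha) are illustrative assumptions.

    import numpy as np

    def rand_sparsify(v, k, rng):
        """Unbiased random-k sparsification: keep k random coordinates, rescale by d/k."""
        d = v.size
        mask = np.zeros(d)
        mask[rng.choice(d, size=k, replace=False)] = 1.0
        return (d / k) * mask * v

    def local_gradient(x, A_i, b_i):
        """Gradient of the local least-squares loss 0.5 * ||A_i x - b_i||^2."""
        return A_i.T @ (A_i @ x - b_i)

    rng = np.random.default_rng(0)
    n_workers, d, k = 4, 10, 3
    A = [rng.standard_normal((20, d)) for _ in range(n_workers)]
    b = [rng.standard_normal(20) for _ in range(n_workers)]

    x = np.zeros(d)
    h = [np.zeros(d) for _ in range(n_workers)]  # each worker's running gradient estimate
    h_avg = np.zeros(d)                          # server tracks the average of the h_i
    lr, alpha = 0.005, k / d                     # alpha = 1/(omega + 1) for rand-k sparsification

    for step in range(2001):
        deltas = []
        for i in range(n_workers):
            g_i = local_gradient(x, A[i], b[i])
            delta_i = rand_sparsify(g_i - h[i], k, rng)  # compress the *difference*, not g_i itself
            h[i] += alpha * delta_i                      # worker gradually learns its local gradient
            deltas.append(delta_i)                       # only the sparse delta_i is communicated
        delta_avg = sum(deltas) / n_workers
        g_hat = h_avg + delta_avg                        # unbiased estimate of the average gradient
        h_avg += alpha * delta_avg                       # server mirrors the workers' memory update
        x -= lr * g_hat
        if step % 500 == 0:
            full_grad = sum(local_gradient(x, A[i], b[i]) for i in range(n_workers)) / n_workers
            print(step, np.linalg.norm(full_grad))

Because the compressed message is a gradient difference, the memory vectors h_i can track the local gradients over time, which is what plain compression of the gradients themselves cannot do.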

    Citations

    Publications citing this paper.
    SHOWING 3 OF 27 CITATIONS

    Natural Compression for Distributed Deep Learning

    On Stochastic Sign Descent Methods

    Gradient Descent with Compressed Iterates

    CITATION STATISTICS

    • 3 Highly Influenced Citations

    • Averaged 14 Citations per year from 2019 through 2020

    References

    Publications referenced by this paper.
    SHOWING 4 OF 20 REFERENCES

    Adding vs. Averaging in Distributed Primal-Dual Optimization

    TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning

    CoCoA: A General Framework for Communication-Efficient Distributed Optimization

    DiSCO: Distributed Optimization for Self-Concordant Empirical Loss
    Yuchen Zhang, Xiao Lin. ICML, 2015.