Corpus ID: 229153349

Better scalability under potentially heavy-tailed feedback

@article{Holland2020BetterSU,
  title={Better scalability under potentially heavy-tailed feedback},
  author={Matthew J. Holland},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.07346}
}
  • Matthew J. Holland
  • Published 2020
  • Computer Science, Mathematics
  • ArXiv
  • We study scalable alternatives to robust gradient descent (RGD) techniques that can be used when the losses and/or gradients can be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly aggregate gradients at each step, which is costly and leads to sub-optimal dimension dependence in risk bounds, we instead focus computational effort on robustly choosing (or newly constructing) a strong candidate based on a collection of cheap…
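
To make the division of labor concrete, below is a minimal Python/NumPy sketch of the general strategy the abstract describes, not the paper's exact algorithm: run several cheap, independent stochastic sub-processes on disjoint data splits, then spend the robustness budget on a single candidate-selection step rather than on robust gradient aggregation at every iteration. The median-distance selection rule (in the spirit of distance-based selection from Hsu and Sabato, 2016), the linear least-squares sub-process, and all constants are illustrative assumptions.

  import numpy as np

  def cheap_sgd(X, y, lr=0.01, epochs=1, rng=None):
      # One inexpensive SGD sub-process for linear least squares
      # (purely illustrative; the paper treats more general losses).
      rng = np.random.default_rng() if rng is None else rng
      w = np.zeros(X.shape[1])
      for _ in range(epochs):
          for i in rng.permutation(len(X)):
              grad = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5*(x_i'w - y_i)^2
              w -= lr * grad
      return w

  def robust_select(candidates):
      # Keep the candidate whose median distance to the other candidates is
      # smallest: a distance-based selection rule; the paper's own
      # selection/merging step may differ in detail.
      C = np.stack(candidates)                                # (k, d)
      dists = np.linalg.norm(C[:, None] - C[None], axis=-1)   # pairwise (k, k)
      scores = np.median(dists, axis=1)
      return C[np.argmin(scores)]

  # Usage sketch: k cheap, independent sub-processes on disjoint splits
  # (embarrassingly parallel), followed by one robust selection step.
  rng = np.random.default_rng(0)
  X = rng.normal(size=(2000, 5))
  w_true = np.arange(1.0, 6.0)
  y = X @ w_true + rng.standard_t(df=2, size=2000)            # heavy-tailed noise
  splits = np.array_split(rng.permutation(len(X)), 8)
  candidates = [cheap_sgd(X[idx], y[idx], rng=rng) for idx in splits]
  w_hat = robust_select(candidates)

The point of the sketch is where the robust computation lands: it is paid once, over k candidate parameter vectors, instead of on a d-dimensional gradient at every iteration, which mirrors the cost and dimension-dependence concern raised in the abstract.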
