Learn More
This paper presents a new incremental gradient algorithm for minimizing the average of a large number of smooth component functions based on delayed partial gradients. Even with a constant step size, which can be chosen independently of the maximum delay bound and the number of objective function components, the expected objective value is guaranteed to(More)
Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, the state of the art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operations are(More)
  • 1