We present the distributed mini-batch algorithm, a method of converting many serial gradient-based online prediction algorithms into distributed algorithms, achieving asymptotically linear speed-up over multiple processors.
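The core idea can be illustrated with a minimal sketch, assuming the standard mini-batch aggregation pattern: each simulated worker computes gradients on its own slice of a mini-batch, the gradients are averaged, and the serial algorithm (here plain SGD, as a stand-in) takes one update per aggregated batch. All function and parameter names below are illustrative, not the paper's notation.

```python
import numpy as np

def distributed_minibatch_sgd(grad_fn, w0, sample_stream, num_workers=4,
                              per_worker_batch=8, rounds=100, lr=0.1):
    """Illustrative sketch of distributed mini-batch gradient aggregation.

    Each round, every simulated worker averages grad_fn over its own
    per_worker_batch samples; the worker averages are averaged again and
    drive a single serial-style SGD step.
    """
    w = w0
    for _ in range(rounds):
        worker_grads = []
        for _ in range(num_workers):
            batch = [next(sample_stream) for _ in range(per_worker_batch)]
            worker_grads.append(np.mean([grad_fn(w, z) for z in batch], axis=0))
        # One update per mini-batch of num_workers * per_worker_batch samples.
        w = w - lr * np.mean(worker_grads, axis=0)
    return w
```

For example, with the stochastic quadratic loss gradient `grad_fn(w, z) = w - z` and samples drawn around a mean vector, the iterate converges to that mean.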

We present a novel Newton-type method for distributed optimization, which is particularly well suited for stochastic optimization and learning problems.

We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward.
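A minimal simulation of the collision reward model described above, under the assumption of Bernoulli-reward arms (the abstract does not specify the reward distribution; the function and argument names are illustrative):

```python
import random
from collections import Counter

def play_round(choices, arm_means, rng):
    """One round of the multi-player bandit with collisions.

    choices: list of arm indices, one per player.
    arm_means: dict mapping arm index to its Bernoulli reward mean.
    Players who pick the same arm collide and receive no reward;
    a lone player on an arm draws a Bernoulli(arm mean) reward.
    """
    counts = Counter(choices)
    rewards = []
    for arm in choices:
        if counts[arm] > 1:
            rewards.append(0.0)  # collision: no reward for anyone on this arm
        else:
            rewards.append(1.0 if rng.random() < arm_means[arm] else 0.0)
    return rewards
```

For instance, if two players choose arm 0 and one chooses arm 1, the first two collide and get nothing regardless of arm 0's mean.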

We study the sample complexity of learning neural networks, by providing new bounds on their Rademacher complexity assuming norm constraints on the parameter matrix of each layer.

We introduce a new scale-invariant kernel approximation model that, given $n$ objects, learns a similarity matrix over all $n^2$ pairs, from crowdsourced data alone.

We study stochastic convex optimization, and uncover a surprisingly different situation in the more general setting: although the problem is learnable (e.g. using online-to-batch conversions), no uniform convergence holds.

We show that there is a simple (approximately radial) function on $\reals^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network to more than a certain constant accuracy, unless its width is exponential in the dimension.