Learn More
We study the problem of exploration in stochastic Multi-Armed Bandits. Even in the simplest setting of identifying the best arm, there remains a logarithmic multiplica-tive gap between the known lower and upper bounds for the number of arm pulls required for the task. This extra logarithmic factor is quite meaningful in nowadays large-scale applications. We(More)
We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown d-dimensional subspace. We devise algorithms with regret bounds that are independent of the number of experts and depend only on the rank d. For the stochastic model we show a tight bound of Θp ? dT q, and(More)
This document describes a method to improve the bandwidth utilization of RTP streams over network paths that carry multiple Real-time Transport Protocol (RTP) streams in parallel between two endpoints, as in voice trunking. The method combines standard protocols that provide compression, multiplexing, and tunneling over a network path for the purpose of(More)
We consider the fundamental problem of maximizing a general quadratic function over an ellipsoidal domain, also known as the trust region problem. We give the first provable linear-time (in the number of non-zero entries of the input) algorithm for approximately solving this problem. Specifically, our algorithm returns an ǫ-approximate solution in time˜O(N/(More)
Robust optimization is a common framework in optimization under uncertainty when the problem parameters are not known, but it is rather known that the parameters belong to some given uncertainty set. In the robust optimization framework the problem solved is a min-max problem where a solution is judged according to its performance on the worst possible(More)
Stochastic exp-concave optimization is an important primitive in machine learning that captures several fundamental problems, including linear regression, logistic regression and more. The exp-concavity property allows for fast convergence rates, as compared to general stochastic optimization. However, current algorithms that attain such rates scale poorly(More)
Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of e O(T 5/6), while the best known lower bound is ⌦(T 1/2). Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case(More)
We study a general class of online learning problems where the feedback is specified by a graph. This class includes online prediction with expert advice and the multi-armed bandit problem, but also several learning problems where the online player does not necessarily observe his own loss. We analyze how the structure of the feedback graph controls the(More)