Corpus ID: 1453718

Distributed Non-Stochastic Experts

@inproceedings{Kanade2012DistributedNE,
  title={Distributed Non-Stochastic Experts},
  author={Varun Kanade and Zhenming Liu and Bozidar Radunovic},
  booktitle={NIPS},
  year={2012}
}
We consider the online distributed non-stochastic experts problem, where the distributed system consists of one coordinator node that is connected to k sites, and the sites are required to communicate with each other via the coordinator. At each time-step t, one of the k site nodes has to pick an expert from the set {1, ..., n}, and the same site receives information about payoffs of all experts for that round. The goal of the distributed system is to minimize regret at time horizon T, while… 
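To make the model concrete, the following is a minimal simulation sketch, in Python, of the setting described above under a naive full-communication baseline: the active site queries the coordinator for shared Hedge (multiplicative-weights) state and relays the full payoff vector back after each round. The function names, the learning rate, and the uniformly random payoffs are illustrative assumptions for exposition, not the paper's algorithm, which is designed precisely to reduce this communication.

import math
import random

def hedge_distribution(cum_payoffs, eta):
    # Exponential-weights (Hedge) distribution over experts,
    # computed from cumulative payoffs; max-shifted for numerical stability.
    m = max(cum_payoffs)
    w = [math.exp(eta * (p - m)) for p in cum_payoffs]
    z = sum(w)
    return [x / z for x in w]

def simulate(n=5, k=4, T=10000, seed=0):
    rng = random.Random(seed)
    eta = math.sqrt(8 * math.log(n) / T)  # standard Hedge rate for payoffs in [0, 1]
    cum = [0.0] * n   # shared cumulative payoffs, held at the coordinator
    messages = 0      # messages routed through the coordinator
    alg_payoff = 0.0
    for _ in range(T):
        site = rng.randrange(k)  # an arbitrary site is active (identity irrelevant under full sharing)
        probs = hedge_distribution(cum, eta)  # site pulls shared state: request + reply
        messages += 2
        expert = rng.choices(range(n), probs)[0]
        payoffs = [rng.random() for _ in range(n)]  # stand-in for non-stochastic payoffs
        alg_payoff += payoffs[expert]
        for i in range(n):
            cum[i] += payoffs[i]
        messages += 1  # site pushes the full payoff vector back
    regret = max(cum) - alg_payoff
    return regret, messages

regret, messages = simulate()
print(f"regret = {regret:.1f}, messages = {messages}")

Under full communication the coordinator effectively runs centralized Hedge, so the regret matches the O(√(T log n)) single-machine bound, but the message count grows linearly in T; the trade-off between this regret and the communication cost is exactly what the paper studies.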

Citations

Information sharing in distributed stochastic bandits
TLDR
This work opens a novel direction towards understanding information sharing for active learning in a distributed environment by specifying a policy that achieves the optimal regret with a logarithmic communication cost for Bernoulli arrivals.
Communication Efficient Parallel Reinforcement Learning
TLDR
The problem where M agents interact with M identical and independent environments with S states and A actions using reinforcement learning for T rounds is considered, and an algorithm that allows the agents to minimize regret with infrequent communication rounds is proposed.
Regret vs. Communication: Distributed Stochastic Multi-Armed Bandits and Beyond
TLDR
This paper considers the distributed stochastic multi-armed bandit problem, where a global arm set can be accessed by multiple players independently and proposes the Over-Exploration strategy, which only requires one-round communication and whose regret does not scale with the number of players.
Social Learning in Multi Agent Multi Armed Bandits
TLDR
A novel algorithm is developed in which agents, whenever they choose to communicate, share only arm-ids and not samples with another agent chosen uniformly and independently at random, demonstrating that even a minimal level of collaboration among the different agents enables a significant reduction in per-agent regret.
Decentralized Exploration in Multi-Armed Bandits
TLDR
A generic algorithm, Decentralized Elimination, is provided, which uses any best arm identification algorithm as a subroutine; it is proved that this algorithm ensures privacy with a low communication cost, and that, in comparison to the lower bound of the best arm identification problem, its sample complexity suffers a penalty depending on the inverse of the probability of the most frequent players.
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits
TLDR
It is demonstrated that even a minimal level of collaboration among agents greatly reduces regret for all agents, and a lower bound shows that the regret scaling obtained by the algorithm cannot be improved even in the absence of any communication constraints.
Collaborative learning of stochastic bandits over a social network
TLDR
A key finding of this paper is that natural extensions of widely-studied single agent learning policies to the network setting need not perform well in terms of regret.
Multi-Agent Multi-Armed Bandits with Limited Communication
TLDR
Limited Communication Collaboration Upper Confidence Bound (LCC-UCB) is presented, a doubling-epoch-based algorithm in which each agent communicates only at the end of each epoch and shares the index of the best arm it knows.
Cooperative Stochastic Multi-agent Multi-armed Bandits Robust to Adversarial Corruptions
TLDR
This work proposes a new algorithm that not only achieves near-optimal regret in the stochastic setting, but also obtains a regret bound with an additive corruption term in the corrupted setting, while maintaining efficient communication.
...

References

Showing 1-10 of 21 references
Distributed delayed stochastic optimization
TLDR
This work shows n-node architectures whose optimization error in stochastic problems, in spite of asynchronous delays, scales asymptotically as O(1/√(nT)) after T iterations, which is known to be optimal for a distributed system with n nodes even in the absence of delays.
Continuous distributed counting for non-monotonic streams
TLDR
It is shown that a randomized algorithm guarantees to track the count accurately with high probability, with expected communication cost Õ(min{√k/(|μ|ε), √(kn)/ε, n}) for an input stream of length n, and matching lower bounds are established.
Efficient Algorithms for Online Decision Problems
Optimal Distributed Online Prediction
TLDR
The distributed mini-batch (DMB) framework is presented, a method of converting a serial gradient-based online algorithm into a distributed algorithm, and an asymptotically optimal regret bound is proved for smooth convex loss functions and stochastic examples.
Distributed Dual Averaging In Networks
TLDR
This work develops and analyzes distributed algorithms based on dual averaging of subgradients, and provides sharp bounds on their convergence rates as a function of the network size and topology.
Protocols for Distributed Classification and Optimization
TLDR
The techniques make use of a novel connection to multipass streaming, as well as an adaptation of the multiplicative-weight-update framework to a distributed setting, and extend to the wide range of problems solvable using these techniques.
Randomized algorithms for tracking distributed count, frequencies, and ranks
TLDR
It is shown that randomization can lead to significant improvements for a few fundamental problems in distributed tracking, and the techniques are extended to two related distributed tracking problems, frequency-tracking and rank-tracking, obtaining similar improvements over previous deterministic algorithms.
Efficient Protocols for Distributed Classification and Optimization
TLDR
This work develops a two-party multiplicative-weight-update based protocol that uses O(d² log(1/ε)) words of communication to classify distributed data in arbitrary dimension d, ε-optimally, and shows how to solve fixed-dimensional and high-dimensional linear programming with small communication in a distributed setting where constraints may be distributed across nodes.
Universal Portfolios
We exhibit an algorithm for portfolio selection that asymptotically outperforms the best stock in the market. Let x_i = (x_{i1}, x_{i2}, ..., x_{im})^t denote the performance of the stock market on… (a minimal code sketch of this portfolio rule follows the reference list).
Distributed Learning, Communication Complexity and Privacy
TLDR
General upper and lower bounds on the amount of communication needed to learn well are provided, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role.
...
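As promised under the Universal Portfolios entry above, here is a minimal sketch, in Python, of Cover's constant-rebalanced-portfolio (CRP) averaging for two assets, discretizing the portfolio simplex. The grid size and the random test data are illustrative assumptions, not from the paper.

import random

def universal_portfolio(price_relatives, grid=101):
    # Cover's universal portfolio for two assets via simplex discretization.
    # price_relatives: list of (x1, x2) daily price ratios.
    # Returns (algorithm wealth, wealth of the best CRP in hindsight).
    bs = [i / (grid - 1) for i in range(grid)]  # weight on asset 1 for each candidate
    wealth_b = [1.0] * grid                     # running wealth of each candidate CRP
    wealth = 1.0
    for x1, x2 in price_relatives:
        total = sum(wealth_b)
        # today's portfolio: performance-weighted average of the candidates
        b_hat = sum(b * w for b, w in zip(bs, wealth_b)) / total
        wealth *= b_hat * x1 + (1 - b_hat) * x2
        # update each candidate's hypothetical wealth
        wealth_b = [w * (b * x1 + (1 - b) * x2) for b, w in zip(bs, wealth_b)]
    return wealth, max(wealth_b)

rng = random.Random(0)
days = [(1 + 0.1 * (rng.random() - 0.5), 1 + 0.1 * (rng.random() - 0.5)) for _ in range(250)]
w_alg, w_best = universal_portfolio(days)
print(f"universal: {w_alg:.3f}  best CRP: {w_best:.3f}")

Cover's guarantee is that the algorithm's wealth stays within a factor polynomial in the number of trading days of the wealth of the best constant-rebalanced portfolio chosen in hindsight.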