A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms
It is shown that arm-sampling rates under UCB are asymptotically deterministic, regardless of the problem complexity, and the first complete process-level characterization of the MAB problem under UCB in the conventional diffusion scaling is provided.
From Finite to Countable-Armed Bandits
A fully adaptive online learning algorithm is proposed that achieves O(log n) distribution-dependent expected cumulative regret after any number of plays n, and it is shown that this order of regret is best possible.
Stochastic approximation algorithms for rumor source inference on graphs
Capacity expansion of neutral ISPs via content provider participation: The bargaining edge
Capacity Expansion of Neutral ISPs via Content Peering Charges: The Bargaining Edge
This paper considers the scenario where CPs peer with an ISP and take the lead in paying peering charges, with the caveat that these payments must be used for capacity expansion, and shows that the bargaining model leads to a higher investment in the ISP infrastructure than even the cooperative model.
The Countable-armed Bandit with Vanishing Arms
Necessary and sufficient conditions for achievability of sub-linear regret are characterized in terms of a critical vanishing rate of optimal arms, and two reservoir distribution-oblivious algorithms that are long-run-average optimal whenever sub-linear regret is statistically achievable are discussed.
MCMC Approaches to Rumor Source Inference using Pairwise Information
In this work, we examine the problem of rumor source inference on a network whose topology is known, given infected nodes and pairwise information in the form of pairwise partial orders on the set of…
From Finite to Countable-Armed Bandits: Appendix
- Anand Kalvit
Since the horizon of play is fixed at n, the decision maker may play at most n distinct arms. Therefore, it suffices to focus only on the sequence of the first n arms that may be played. A…
Bandits with Dynamic Arm-acquisition Costs
A bandit problem is considered in which the decision maker can add new arms to her consideration set at any time; a new algorithm is proposed, along with analyses leading to tighter bounds for an existing algorithm from the literature.
Dynamic Learning in Large Matching Markets
We study a sequential matching problem faced by large centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete…