This paper considers sequential prediction with expert advice under partial monitoring when the loss is unbounded. We deal with a wide class of partial monitoring problems: the combination of the label-efficient and multi-armed bandit problems, that is, where the algorithm is only informed about the performance of the chosen ...
The loss version of the multi-armed bandit problem is carried out in T iterations. At the beginning of each iteration, an adversary assigns a loss from [0, l] to each of the K options (also called arms). Then, without knowing the adversary's assignments, we must select one of the K arms and suffer the loss assigned to it. Here we ...
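The protocol described in this abstract can be sketched as a short simulation. This is a minimal illustration of the game loop only, not the authors' algorithm: the names `play_bandit`, `uniform_adversary`, and `uniform_player` are hypothetical, the adversary draws losses uniformly at random as a stand-in for an arbitrary assignment, and the player picks arms uniformly at random without using any feedback.

```python
import random


def play_bandit(T, K, l, assign_losses, choose_arm, seed=0):
    """Run the T-iteration loss game and return the player's cumulative loss."""
    rng = random.Random(seed)
    total = 0.0
    for t in range(T):
        # The adversary assigns a loss in [0, l] to each of the K arms first.
        losses = assign_losses(t, K, l, rng)
        assert len(losses) == K and all(0.0 <= x <= l for x in losses)
        # The player selects an arm without seeing the assigned losses.
        arm = choose_arm(t, K, rng)
        # Only the loss of the selected arm is suffered.
        total += losses[arm]
    return total


def uniform_adversary(t, K, l, rng):
    # Assumption: random losses stand in for an arbitrary adversary.
    return [rng.uniform(0.0, l) for _ in range(K)]


def uniform_player(t, K, rng):
    # Assumption: a stateless placeholder policy, not a learning algorithm.
    return rng.randrange(K)


total_loss = play_bandit(
    T=100, K=5, l=2.0,
    assign_losses=uniform_adversary,
    choose_arm=uniform_player,
)
print(f"cumulative loss over 100 iterations: {total_loss:.3f}")
```

Since each per-iteration loss lies in [0, l], the cumulative loss always lies in [0, T·l]; a learning policy would replace `uniform_player` and keep state between iterations.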