Learn More
We study the dynamic regret of multi-armed bandit and experts problem in non-stationary stochastic environments. We introduce a new parameter Λ, which measures the total statistical variance of the loss distributions over T rounds of the process, and study how this amount affects the regret. We investigate the interaction between Λ and Γ, which counts the(More)
  • 1