- Michal Valko, Rémi Munos, Branislav Kveton, Tomás Kocák
- ICML
- 2014

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each item we can recommend is a node…

- Tomás Kocák, Gergely Neu, Michal Valko, Rémi Munos
- NIPS
- 2014

We consider online learning problems under a partial observability model capturing situations where the information conveyed to the learner is between full information and bandit feedback. In the simplest variant, we assume that in addition to its own loss, the learner also gets to observe losses of some other actions. The revealed losses depend on the…

- Tomás Kocák, Michal Valko, Rémi Munos, Shipra Agrawal
- AAAI
- 2014

Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, in particular in computational advertising. Though successful, the tools for its performance analysis appeared only recently. In this paper, we describe and analyze the SpectralTS algorithm for a bandit problem, where the payoffs of the choices are smooth given an…

- Tomás Kocák, Gergely Neu, Michal Valko
- AISTATS
- 2016

We propose a new partial-observability model for online learning problems where the learner, besides its own loss, also observes some noisy feedback about the other actions, depending on the underlying structure of the problem. We represent this structure by a weighted directed graph, where the edge weights are related to the quality of the feedback shared…

- Tomás Kocák, Gergely Neu, Michal Valko
- UAI
- 2016

We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms besides the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with an unknown probability $r_t$, independently of each other and the action of the learner. Moreover, we allow $r_t$ to change in every…

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each recommended item is a node and…
