The non-stationary stochastic multi-armed bandit problem
TLDR
We consider a variant of the stochastic multi-armed bandit with K arms where the rewards are not assumed to be identically distributed, but are generated by a non-stationary process.
EXP3 with drift detection for the switching bandit problem
TLDR
The multi-armed bandit is a model of exploration and exploitation, where one must select, within a finite set of arms, the one which maximizes the cumulative reward up to the time horizon.
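The entry above builds on EXP3, the standard adversarial-bandit algorithm. A minimal sketch of plain EXP3 (not the paper's drift-detection variant; the parameter values and reward function below are illustrative assumptions):

```python
import math
import random

def exp3(K, T, gamma, reward_fn):
    """Run standard EXP3 over K arms for T rounds; returns total reward.

    gamma in (0, 1] is the exploration rate; reward_fn(arm, t) must
    return a reward in [0, 1].
    """
    weights = [1.0] * K
    total_reward = 0.0
    for t in range(T):
        total_w = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total_w + gamma / K for w in weights]
        arm = random.choices(range(K), weights=probs)[0]
        reward = reward_fn(arm, t)
        # Importance-weighted reward estimate for the pulled arm only.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / K)
        total_reward += reward
    return total_reward
```

On a toy instance where one arm is always best, the weight of that arm grows exponentially and EXP3 concentrates its pulls on it, up to the forced `gamma / K` exploration; the switching-bandit setting addressed in the paper adds a mechanism to reset this learning when a distribution change (drift) is detected.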
A Neural Networks Committee for the Contextual Bandit Problem
TLDR
This paper presents a new contextual bandit algorithm, NeuralBandit, which does not require a stationarity hypothesis on contexts and rewards.
Contextual Bandit for Active Learning: Active Thompson Sampling
TLDR
Labelling training examples is a costly task in supervised classification.
Random Forest for the Contextual Bandit Problem
TLDR
We propose an online random forest algorithm, BANDIT FOREST, which is optimal up to logarithmic factors.
Deep Learning Based Approach for Entity Resolution in Databases
TLDR
This paper proposes a Deep Neural Network (DNN)-based approach for entity resolution in databases.
Selection of learning experts
TLDR
We reduce the selection of learning experts to stochastic MAB problems and propose a randomized variant of SUCCESSIVE ELIMINATION.
Bandits Manchots sur Flux de Données Non Stationnaires. (Multi-armed bandits for non-stationary data streams)
TLDR
The multi-armed bandit problem is a theoretical framework for studying the trade-off between exploration and exploitation when the observed information is partial.
Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem
TLDR
We consider a non-stationary formulation of the stochastic multi-armed bandit where the rewards are no longer assumed to be identically distributed.
Prise de décision contextuelle en bande organisée : Quand les bandits font un brainstorming. (Contextual decision-making as an organized gang: when bandits brainstorm)
In this article, we propose a new contextual bandit algorithm, NeuralBandit, which makes no stationarity assumption on contexts and rewards.