A Fast and Accurate Face Detector Based on Neural Networks
TLDR
The level of performance reached, in terms of detection accuracy and processing time, allows us to apply this detector to a real-world application: the indexing of images and videos.
Generic Exploration and K-armed Voting Bandits
TLDR
A generic pure-exploration algorithm is proposed that copes with various utility functions, from multi-armed bandit settings to dueling bandits, and offers a natural generalization of dueling bandits to situations where the environment parameters reflect the idiosyncratic preferences of a mixed crowd.
A fast and accurate face detector for indexation of face images
TLDR
The level of performance reached in this approach, in terms of detection accuracy and processing time, allows this detector to be applied to a real-world application: the indexation of face images on the WWW.
The non-stationary stochastic multi-armed bandit problem
We consider a variant of the stochastic multi-armed bandit with K arms where the rewards are not assumed to be identically distributed, but are generated by a non-stationary stochastic process.
Random Forest for the Contextual Bandit Problem
TLDR
An online random forest algorithm is proposed to address the contextual bandit problem, based on the sample complexity needed to find the optimal decision stump, and it is shown that the proposed algorithm is optimal up to logarithmic factors.
A Neural Networks Committee for the Contextual Bandit Problem
TLDR
A new contextual bandit algorithm, NeuralBandit, which requires no stationarity assumption on contexts and rewards, is presented, and two variants, based on a multi-expert approach, are proposed to choose the parameters of the multi-layer perceptrons online.
EXP3 with drift detection for the switching bandit problem
TLDR
This paper considers a variant of the adversarial multi-armed bandit problem, where the time horizon is divided into unknown time periods within which rewards are drawn from stochastic distributions, and presents an algorithm that takes advantage of the constant exploration of EXP3 to detect when the best arm changes.
Contextual Bandit for Active Learning: Active Thompson Sampling
TLDR
A sequential algorithm named Active Thompson Sampling (ATS) is proposed, which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for this sample point's label.