Corpus ID: 230433689

Etat de l'art sur l'application des bandits multi-bras

@article{Bouneffouf2021EtatDL,
  title={Etat de l'art sur l'application des bandits multi-bras},
  author={Djallel Bouneffouf},
  journal={ArXiv},
  year={2021},
  volume={abs/2101.00001}
}
The field of multi-armed bandits is currently experiencing a renaissance, as new problem settings and algorithms motivated by various practical applications are introduced, adding to the classical bandit problem. This article aims to provide a comprehensive review of the main recent developments in multiple real-world applications of bandits. More specifically, we introduce a taxonomy of common MAB-based applications and summarize the state of…

References

SHOWING 1-10 OF 115 REFERENCES
Prise de décision contextuelle en bande organisée : Quand les bandits font un brainstorming
In this article, we propose a new contextual bandit algorithm, NeuralBandit, which makes no stationarity assumption on the contexts and rewards. The proposed algorithm…
Rôle de l'inférence temporelle dans la reconnaissance de l'inférence textuelle
This project falls within the field of natural language processing. Its objective is to develop a textual inference recognition system named TIMINF. This type of system makes it possible to…
" L'apprentissage automatique ", une étape importante dans l'adaptation des systèmes d'information à l'utilisateur
The work discussed here lies in the field of context-aware computing. Its objective is to facilitate a user's access to information, via an information system, by…
A Survey on Practical Applications of Multi-Armed and Contextual Bandits
TLDR
A taxonomy of common MAB-based applications is introduced and the state of the art for each of those domains is summarized, to identify important current trends and provide new perspectives on the future of this fast-growing field.
Recommandation mobile, sensible au contexte de contenus évolutifs: Contextuel-E-Greedy
TLDR
An algorithm named Contextuel-E-Greedy is introduced; it is based on a dynamic exploration/exploitation trade-off and adaptively balances the two by deciding which situations are most relevant for exploration or exploitation.
Algorithm Selection as a Bandit Problem with Unbounded Losses
TLDR
This paper adapts an existing solver to this game, proving a bound on its expected regret, which holds also for the resulting algorithm selection technique, and presents experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark.
Node-based optimization of LoRa transmissions with Multi-Armed Bandit algorithms
TLDR
The possibility of optimizing the performance of LoRaWAN technology is demonstrated using multi-armed bandit algorithms to select the communication parameters (spreading factor and emission power); extensive simulations show that such learning methods manage the trade-off between energy consumption and packet loss much better.
Large-Scale Bandit Approaches for Recommender Systems
TLDR
This paper proposes two large-scale bandit approaches for situations in which no prior information is available, and theoretically proves that these approaches can converge to optimal item recommendations in the long run.
Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit
TLDR
The META algorithm is developed, which effectively hedges between two other algorithms: one which uses both historical observations and clustering, and another which uses only the historical observations.
Multi-armed bandit problem with known trend
TLDR
A new algorithm named Adjusted Upper Confidence Bound (A-UCB) is proposed that assumes a stochastic model, with upper bounds on the regret that compare favorably with those of UCB1.