Linear Thompson Sampling Revisited
- Marc Abeille, A. Lazaric
- Computer Science
- International Conference on Artificial…
- 1 November 2016
Thompson sampling can be seen as a generic randomized algorithm where the sampling distribution is designed to have a fixed probability of being optimistic, at the cost of an additional $\sqrt{d}$ regret factor compared to a UCB-like approach.
Improved Optimistic Algorithms for Logistic Bandits
- Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq
- Computer Science
- International Conference on Machine Learning
- 18 February 2020
A new optimistic algorithm is proposed, based on a finer examination of the non-linearities of the reward function, that enjoys a $\tilde{\mathcal{O}}(\sqrt{T})$ regret with no dependency on $\kappa$ except in a second-order term.
Thompson Sampling for Linear-Quadratic Control Problems
- Marc Abeille, A. Lazaric
- Computer Science
- International Conference on Artificial…
- 27 March 2017
The regret of Thompson sampling in the frequentist setting is analyzed, yielding an overall regret of $O(T^{2/3})$, which is significantly worse than the regret achieved by the optimism-in-face-of-uncertainty algorithm in LQ control problems.
Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems
- Marc Abeille, A. Lazaric
- Computer Science, Mathematics
- International Conference on Machine Learning
- 3 July 2018
A novel bound on the regret due to policy switches is obtained. It holds for LQ systems of any dimensionality and allows the parameters and the policy to be updated at each step, thus overcoming previous limitations due to lazy updates.
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
- Marc Abeille, Louis Faury, Clément Calauzènes
- Computer Science
- International Conference on Artificial…
- 23 October 2020
A novel algorithm is introduced for which the permanent regime non-linearity can dramatically ease the exploration-exploitation trade-off, and its rate is proved to be minimax-optimal by deriving an $\Omega(d\sqrt{T/\kappa})$ problem-dependent lower bound.
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation
- Marc Abeille, A. Lazaric
- Computer Science
- International Conference on Machine Learning
- 12 July 2020
This work proposes to relax the optimistic optimization of OFU-LQ and cast it into a constrained LQR problem, where an additional control variable implicitly selects the system dynamics within a confidence interval, and proves strong duality.
Regret Bounds for Generalized Linear Bandits under Parameter Drift
- Louis Faury, Yoan Russac, Marc Abeille, Clément Calauzènes
- Computer Science
- ArXiv
- 9 March 2021
This work introduces a new algorithm that addresses central mechanisms inherited from the Linear Bandit setting by explicitly splitting the treatment of the learning and tracking aspects of the problem, and proves that under a geometric assumption on the action set, this approach enjoys a regret bound.
Thompson Sampling in Non-Episodic Restless Bandits
- Young Hun Jung, Marc Abeille, Ambuj Tewari
- Computer Science
- ArXiv
- 12 October 2019
The algorithm adapts the TSDE algorithm of Ouyang et al. (2017) in a non-trivial manner to account for the special structure of restless bandits and proves a sub-linear, $O(\sqrt{T}\log T)$, regret bound for a variant of Thompson sampling.
LQG for Portfolio Optimization
- Marc Abeille, E. Sérié, A. Lazaric, Xavier Brokmann
- Economics
- 3 November 2016
We introduce a generic solver for dynamic portfolio allocation problems when the market exhibits return predictability, price impact and partial observability. We assume that the price modeling can…
Explicit shading strategies for repeated truthful auctions
- Marc Abeille, Clément Calauzènes, Noureddine El Karoui, Thomas Nedelec, Vianney Perchet
- Economics
- ArXiv
- 1 May 2018
It is concluded that a return to simple first-price auctions with no reserve price, or at least non-dynamic anonymous ones, is desirable from the point of view of buyers, sellers, and increased transparency.