# From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model

• Published in
International Conference on…
1 March 2019
• Computer Science
We consider PAC-learning a good item from $k$-subsetwise feedback information sampled from a Plackett-Luce probability model, with instance-dependent sample complexity performance. In the setting where subsets of a fixed size can be tested and top-ranked feedback is made available to the learner, we give an algorithm with optimal instance-dependent sample complexity, for PAC best arm identification, of $O\bigg(\frac{\theta_{[k]}}{k}\sum_{i = 2}^n\max\Big(1,\frac{1}{\Delta_i^2}\Big) \ln\frac{k…

• Computer Science
ICML
• 2020
This paper studies sample complexity (i.e., number of comparisons) bounds for active best-$k$ items selection from pairwise comparisons and proposes two algorithms based on PAC best items selection algorithms, one of which works for $k=1$ and is sample complexity optimal up to a loglog factor.
• Computer Science
NeurIPS
• 2019
This paper aims at exact ranking without prior knowledge of the instances, whereas most previous works either focus on approximate rankings or study exact ranking but require prior knowledge.
• Computer Science
ArXiv
• 2022
This work proposes a novel reduction from any (general) dueling bandits to multi-armed bandits; despite its simplicity, the reduction improves many existing results in dueling bandits.
• Computer Science
• 2022
The robustness of the proposed algorithm is justified by proving its optimal regret rate under adversarially corrupted preferences; this outperforms the existing state-of-the-art corrupted dueling results by a large margin.
• Computer Science, Mathematics
ArXiv
• 2022
A generic algorithm suitable to cover the full spectrum of conceivable arm elimination strategies, from aggressive to conservative, is suggested. Theoretical questions about the sufficient and necessary budget of the algorithm to choose the best arm are answered and complemented by deriving lower bounds for any learning algorithm for this problem scenario.
• Computer Science
• 2021
This work helps understand whether and to what degree utilizing multi-wise comparisons can reduce the sample complexity of ranking problems compared to ranking from pairwise comparisons.
• Computer Science
NeurIPS
• 2021
The Dvoretzky–Kiefer–Wolfowitz tournament (DKWT) algorithm is proposed, which proves to be nearly optimal and empirically outperforms current state-of-the-art algorithms, even in the special case of dueling bandits or under a Plackett-Luce assumption on the feedback mechanism.
• Computer Science
ArXiv
• 2022
This work proposes a new notion of Internal Regret for sleeping MAB, and proposes an algorithm that yields sublinear regret in that measure, even for a completely adversarial sequence of losses and availabilities.
This paper investigates the elicitation of necessarily Pareto optimal (NPO) and necessarily rank-maximal (NRM) matchings, answers an open question, and gives an online algorithm for eliciting an NRM matching in the next-best query model that is 3/2-competitive.
• Computer Science
ArXiv
• 2022
An elimination-based rescheduling algorithm is developed and shown to achieve a near-optimal dynamic regret bound, where $S^{\mathrm{CW}}$ is the number of times the Condorcet winner changes in $T$ rounds.

