• Corpus ID: 67856507

# From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model

@inproceedings{Saha2019FromPT,
title={From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model},
booktitle={International Conference on Machine Learning},
year={2019}
}
• Published in
International Conference on…
1 March 2019
• Computer Science
We consider PAC-learning a good item from $k$-subsetwise feedback information sampled from a Plackett-Luce probability model, with instance-dependent sample complexity performance. In the setting where subsets of a fixed size can be tested and top-ranked feedback is made available to the learner, we give an algorithm with optimal instance-dependent sample complexity, for PAC best arm identification, of $O\bigg(\frac{\theta_{[k]}}{k}\sum_{i = 2}^n\max\Big(1,\frac{1}{\Delta_i^2}\Big) \ln\frac{k…

• Computer Science
ICML
• 2020
This paper studies sample complexity (i.e., number of comparisons) bounds for active best-$k$ items selection from pairwise comparisons and proposes two algorithms based on PAC best-items selection algorithms, one of which works for $k=1$ and is sample-complexity optimal up to a loglog factor.
• Computer Science
NeurIPS
• 2019
This paper aims at exact ranking without prior knowledge of the instances, whereas most previous works either focus on approximate rankings or study exact ranking but require prior knowledge.
• Computer Science
ArXiv
• 2022
This work proposes a novel reduction from any (general) dueling bandits to multi-armed bandits; despite its simplicity, the reduction improves many existing results in dueling bandits.
• Computer Science
• 2022
The robustness of the proposed algorithm is justified by proving its optimal regret rate under adversarially corrupted preferences—this outperforms the existing state-of-the-art corrupted dueling results by a large margin.
• Computer Science, Mathematics
ArXiv
• 2022
A generic algorithm suitable to cover the full spectrum of conceivable arm elimination strategies, from aggressive to conservative, is suggested. Theoretical questions about the sufficient and necessary budget of the algorithm to choose the best arm are answered and complemented by lower bounds for any learning algorithm in this problem scenario.
• Computer Science
• 2021
This work helps clarify whether and to what degree utilizing multi-wise comparisons can reduce the sample complexity of ranking problems compared to ranking from pairwise comparisons.
• Computer Science
NeurIPS
• 2021
The Dvoretzky–Kiefer–Wolfowitz tournament (DKWT) algorithm is proposed, which proves to be nearly optimal and empirically outperforms current state-of-the-art algorithms, even in the special case of dueling bandits or under a Plackett-Luce assumption on the feedback mechanism.
• Computer Science
ArXiv
• 2022
This work proposes a new notion of Internal Regret for sleeping MAB, and proposes an algorithm that yields sublinear regret in that measure, even for a completely adversarial sequence of losses and availabilities.
This paper investigates the elicitation of necessarily Pareto optimal (NPO) and necessarily rank-maximal (NRM) matchings and answers an open question and gives an online algorithm for eliciting an NRM matching in the next-best query model which is 3/2-competitive.
• Computer Science
ArXiv
• 2022
An elimination-based rescheduling algorithm is developed and shown to achieve a near-optimal dynamic regret bound, where $S^{\mathrm{CW}}$ is the number of times the Condorcet winner changes in $T$ rounds.
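
The top-ranked (winner) feedback described in the abstract above can be illustrated with a minimal sketch. It assumes the standard Plackett-Luce choice rule, under which an item in the played subset wins with probability proportional to its score. The score vector `theta`, the helper names, and the chosen subset are hypothetical and serve only as an illustration; this is not the algorithm from the paper.

```python
import random
from collections import Counter


def sample_winner(theta, subset, rng):
    """Draw the top-ranked item of `subset` under a Plackett-Luce model.

    Item i wins with probability theta[i] / sum(theta[j] for j in subset).
    """
    weights = [theta[i] for i in subset]
    return rng.choices(subset, weights=weights, k=1)[0]


def empirical_win_rates(theta, subset, num_rounds, seed=0):
    """Play the same subset repeatedly and return each item's win frequency."""
    rng = random.Random(seed)
    counts = Counter(sample_winner(theta, subset, rng) for _ in range(num_rounds))
    return {i: counts[i] / num_rounds for i in subset}


# Hypothetical scores for n = 4 items; item 0 is the best item.
theta = [1.0, 0.6, 0.3, 0.1]
subset = [0, 1, 3]  # a subset of size k = 3
rates = empirical_win_rates(theta, subset, num_rounds=20000)
```

With enough rounds, the empirical win rate of each item in `rates` should approach its score divided by the total score of the subset, which is the quantity a learner has to estimate from this kind of feedback.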

## References

SHOWING 1-10 OF 41 REFERENCES

• Computer Science
ALT
• 2019
Two algorithms are proposed for the PAC problem with the TR feedback model with optimal (up to logarithmic factors) sample complexity guarantees, establishing the increase in statistical efficiency from exploiting rank-ordered feedback.
• Computer Science
AISTATS
• 2019
These algorithms rely on a novel pivot trick to maintain only $n$ itemwise score estimates, unlike the $O(n^2)$ pairwise score estimates that have been used in prior work.
• Computer Science
ArXiv
• 2018
We introduce the probably approximately correct (PAC) version of the problem of Battling-bandits with the Plackett-Luce (PL) model – an online learning framework where in each trial, the learner
• Computer Science
ArXiv
• 2018
This paper derives a lower bound on the sample complexity (aka number of queries), and proposes an algorithm that is sample-complexity-optimal up to an $O(\log(k+l)/\log{k})$ factor and designs ranking algorithms that recover the top-$k$ or total ranking using as few queries as possible.
• Computer Science
NIPS
• 2017
This work examines an M-wise comparison model that builds on the Plackett-Luce model where for each sample, M items are ranked according to their perceived utilities modeled as noisy observations of their underlying true utilities.
• Computer Science
SODA
• 2018
This work designs a new active ranking algorithm without using any information about the underlying items' preference scores, and establishes a matching lower bound on the sample complexity even when the set of preference scores is given to the algorithm.
• Computer Science
ICML
• 2012
The expected sample complexity bound for LUCB is novel even for single-arm selection, and a lower bound on the worst case sample complexity of PAC algorithms for Explore-m is given.
• Computer Science
SODA
• 2017
A linear time algorithm is presented which has a competitive ratio of $O(\sqrt{n})$, i.e., it needs at most $O(\sqrt{n})$ times as many samples as the best possible algorithm for that instance of top-K, and it is shown that this is tight: any algorithm for the top-K problem has competitive ratio at least $\Omega(\sqrt{n})$.
• Computer Science
J. Mach. Learn. Res.
• 2016
This work introduces generic notions of complexity for the two dominant frameworks considered in the literature: fixed-budget and fixed-confidence settings, and provides the first known distribution-dependent lower bound on the complexity that involves information-theoretic quantities and holds when m ≥ 1 under general assumptions.