On kernel methods for covariates that are rankings
@article{Mania2016OnKM, title={On kernel methods for covariates that are rankings}, author={Horia Mania and Aaditya Ramdas and Martin J. Wainwright and Michael I. Jordan and Benjamin Recht}, journal={arXiv: Machine Learning}, year={2016} }
Permutation-valued features arise in a variety of applications, either in a direct way when preferences are elicited over a collection of items, or an indirect way in which numerical ratings are converted to a ranking. To date, there has been relatively limited study of regression, classification, and testing problems based on permutation-valued features, as opposed to permutation-valued responses. This paper studies the use of reproducing kernel Hilbert space methods for learning from…
18 Citations
Sampling Permutations for Shapley Value Estimation
- Computer Science
- J. Mach. Learn. Res.
- 2022
This work investigates new approaches based on two classes of approximation methods and compares them empirically. It also demonstrates quadrature techniques in an RKHS of functions of permutations, using the Mallows kernel in combination with kernel herding and sequential Bayesian quadrature.
Bayesian Optimization over Hybrid Spaces
- Computer Science
- ICML
- 2021
This paper develops a principled approach for constructing diffusion kernels over hybrid spaces using the additive kernel formulation, which allows additive interactions of all orders in a tractable manner. It theoretically analyzes the modeling strength of additive hybrid kernels and proves that they have the universal approximation property.
Gaussian field on the symmetric group: Prediction and learning
- Computer Science, Mathematics
- 2020
This paper proposes and studies a harmonic analysis of the covariance operators that makes it possible to apply the full machinery of Gaussian process learning in the less classical case where X is the noncommutative finite group of permutations.
Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions
- Computer Science
- ArXiv
- 2020
This paper designs two-sample tests for pairwise comparison data and ranking data, establishes an upper bound on the sample complexity required to correctly distinguish between the distributions of the two sets of samples, and investigates the role of modeling assumptions by proving lower bounds for a range of pairwise comparison models.
Gaussian Processes indexed on the symmetric group: prediction and learning
- Computer Science, Mathematics
- 2018
This paper proposes and studies a harmonic analysis of the covariance operators that makes it possible to consider Gaussian process models and forecasting issues; the work is motivated by statistical ranking problems.
Bayesian Optimization over Permutation Spaces
- Computer Science
- ArXiv
- 2021
Two algorithms for Bayesian optimization (BO) over Permutation Spaces (BOPS) are proposed and evaluated, showing that both BOPS-T and BOPS-H perform better than the state-of-the-art BO algorithm for combinatorial spaces.
New developments around dependence measures for sensitivity analysis: application to severe accident studies for generation IV reactors (English version)
- Computer Science
- 2019
The work carried out in this thesis aims to propose new statistical methods based on dependence measures for global sensitivity analysis (GSA) of numerical simulators, with particular interest in HSIC-type dependence measures (Hilbert-Schmidt Independence Criterion).
Bandwidth-Optimal Random Shuffling for GPUs
- Computer Science
- ACM Trans. Parallel Comput.
- 2022
Experimental results show that the bijective shuffle algorithm outperforms competing algorithms on GPUs, showing improvements of between one and two orders of magnitude and approaching peak device bandwidth.
Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment
- Computer Science
- AAAI
- 2021
This paper designs a principled test for detecting strategic behaviour, designs an experiment that elicits strategic behaviour from subjects and releases a dataset of patterns of strategic behaviour that may be of independent interest, and proves that the test has strong false alarm guarantees.
Fourier Bases for Solving Permutation Puzzles
- Computer Science
- AISTATS
- 2021
The effectiveness of learning a value function in the Fourier basis for solving various permutation puzzles is demonstrated and it is shown that it outperforms standard deep learning methods.
21 References
A Kernel Two-Sample Test
- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2012
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
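As a rough illustration of the maximum mean discrepancy mentioned in this entry, the sketch below computes the standard biased estimate of the squared MMD between two samples. The function names `gaussian_kernel` and `mmd2_biased`, the Gaussian kernel, and the `bandwidth` parameter are assumptions made for this example, and the sketch only computes the statistic; it does not implement the distribution-free tests from the referenced work.

```python
from math import exp


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric sequences.

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    Averages the kernel within each sample and across the two samples:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    """
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy
```

Because `kernel` is a plain argument, any positive definite kernel can be swapped in, including one defined on rankings rather than on numeric vectors.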
Fourier Theoretic Probabilistic Inference over Permutations
- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2009
This paper uses the "low-frequency" terms of a Fourier decomposition to represent distributions over permutations compactly, and presents Kronecker conditioning, a novel approach for maintaining and updating these distributions directly in the Fourier domain, allowing for polynomial time bandlimited approximations.
The Kendall and Mallows Kernels for Permutations
- Computer Science, Mathematics
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2018
It is shown that the widely used Kendall tau correlation coefficient, and the related Mallows kernel, are positive definite kernels for permutations, and how to extend these kernels to partial rankings, multivariate rankings and uncertain rankings.
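A minimal sketch of the two kernels named in this entry, assuming complete rankings without ties over at least two items, where `sigma[i]` is the rank assigned to item `i`. The function names and the `lam` parameter are chosen for this example. The Kendall kernel is taken to be the Kendall tau correlation (concordant minus discordant pairs, divided by the number of pairs), and the Mallows kernel to be the exponential of minus `lam` times the number of discordant pairs.

```python
from itertools import combinations
from math import exp


def discordant_pairs(sigma, tau):
    """Count item pairs that the two rankings order in opposite ways."""
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (tau[i] - tau[j]) < 0
    )


def kendall_kernel(sigma, tau):
    """Kendall tau correlation: (concordant - discordant) / number of pairs."""
    n = len(sigma)
    total = n * (n - 1) // 2  # number of item pairs; requires n >= 2
    n_d = discordant_pairs(sigma, tau)
    return (total - 2 * n_d) / total


def mallows_kernel(sigma, tau, lam=1.0):
    """Mallows kernel: exp(-lam * number of discordant pairs)."""
    return exp(-lam * discordant_pairs(sigma, tau))
```

For example, `kendall_kernel([0, 1, 2], [2, 1, 0])` compares a ranking with its reverse, so every pair is discordant and the value is -1. This direct pair count costs quadratic time in the number of items.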
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing
- Computer Science, Mathematics
- ArXiv
- 2015
This paper formally characterizes the power of popular tests for GDA like the Maximum Mean Discrepancy with the Gaussian kernel (gMMD) and bandwidth-dependent variants of the Energy Distance with the Euclidean norm (eED) in the high-dimensional MDA regime.
Scikit-learn: Machine Learning in Python
- Computer Science
- J. Mach. Learn. Res.
- 2011
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing…
A two-sample test for high-dimensional data with applications to gene-set testing
- Computer Science, Mathematics
- 2010
We propose a two-sample test for means of high-dimensional data when the data dimension is much larger than the sample size. The classical Hotelling's $T^2$ test does not work for this ``large p,…
Modeling heterogeneity in ranked responses by nonparametric maximum likelihood: How do Europeans get their scientific knowledge?
- Mathematics
- 2010
This paper is motivated by a Eurobarometer survey on science knowledge. As part of the survey, respondents were asked to rank sources of science information in order of importance. The official…
Universal Kernels on Non-Standard Input Spaces
- Computer Science
- NIPS
- 2010
This work provides a general technique based on Taylor-type kernels to explicitly construct universal kernels on compact metric spaces which are not subsets of $\mathbb{R}^d$, and applies this technique to the following special cases: universal kernels on the set of probability measures, universal kernels based on Fourier transforms, and universal kernels for signal processing.