On kernel methods for covariates that are rankings

@article{Mania2016OnKM,
  title={On kernel methods for covariates that are rankings},
  author={Horia Mania and Aaditya Ramdas and Martin J. Wainwright and Michael I. Jordan and Benjamin Recht},
  journal={arXiv: Machine Learning},
  year={2016}
}
Permutation-valued features arise in a variety of applications, either in a direct way when preferences are elicited over a collection of items, or an indirect way in which numerical ratings are converted to a ranking. To date, there has been relatively limited study of regression, classification, and testing problems based on permutation-valued features, as opposed to permutation-valued responses. This paper studies the use of reproducing kernel Hilbert space methods for learning from… 
Sampling Permutations for Shapley Value Estimation
TLDR
This work investigates new approaches based on two classes of approximation methods and compares them empirically, and demonstrates quadrature techniques in an RKHS containing functions of permutations, using the Mallows kernel in combination with kernel herding and sequential Bayesian quadrature.
Bayesian Optimization over Hybrid Spaces
TLDR
This paper develops a principled approach for constructing diffusion kernels over hybrid spaces using the additive kernel formulation, which allows additive interactions of all orders in a tractable manner. It also theoretically analyzes the modeling strength of additive hybrid kernels and proves that they have the universal approximation property.
Gaussian field on the symmetric group: Prediction and learning
TLDR
This paper proposes and studies a harmonic analysis of the covariance operators that makes it possible to apply the full machinery of Gaussian process learning in the less classical case where X is the noncommutative finite group of permutations.
Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions
TLDR
This paper designs two-sample tests for pairwise comparison data and ranking data, establishes an upper bound on the sample complexity required to correctly distinguish between the distributions of the two sets of samples, and investigates the role of modeling assumptions by proving lower bounds for a range of pairwise comparison models.
Gaussian Processes indexed on the symmetric group: prediction and learning
TLDR
This paper proposes and studies a harmonic analysis of the covariance operators that makes it possible to consider Gaussian process models and forecasting issues, and is motivated by statistical ranking problems.
Bayesian Optimization over Permutation Spaces
TLDR
Two algorithms for BO over Permutation Spaces (BOPS) are proposed and evaluated, showing that both BOPS-T and BOPS-H perform better than the state-of-the-art BO algorithm for combinatorial spaces.
New developments around dependence measures for sensitivity analysis: application to severe accident studies for generation IV reactors (English version)
TLDR
The work carried out in this thesis aims to propose new statistical methods based on dependence measures for the global sensitivity analysis (GSA) of numerical simulators, with particular interest in HSIC-type dependence measures (Hilbert-Schmidt Independence Criterion).
Bandwidth-Optimal Random Shuffling for GPUs
TLDR
Experimental results show that the bijective shuffle algorithm outperforms competing algorithms on GPUs, showing improvements of between one and two orders of magnitude and approaching peak device bandwidth.
Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment
TLDR
This paper designs a principled test for detecting strategic behaviour, designs an experiment that elicits strategic behaviour from subjects and releases a dataset of patterns of strategic behaviour that may be of independent interest, and proves that the test has strong false alarm guarantees.
Fourier Bases for Solving Permutation Puzzles
TLDR
The effectiveness of learning a value function in the Fourier basis for solving various permutation puzzles is demonstrated and it is shown that it outperforms standard deep learning methods.

References

A Kernel Two-Sample Test
TLDR
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
Group representations in probability and statistics
Fourier Theoretic Probabilistic Inference over Permutations
TLDR
This paper uses the "low-frequency" terms of a Fourier decomposition to represent distributions over permutations compactly, and presents Kronecker conditioning, a novel approach for maintaining and updating these distributions directly in the Fourier domain, allowing for polynomial time bandlimited approximations.
The Kendall and Mallows Kernels for Permutations
TLDR
It is shown that the widely used Kendall tau correlation coefficient, and the related Mallows kernel, are positive definite kernels for permutations, and how to extend these kernels to partial rankings, multivariate rankings and uncertain rankings.
Adaptivity and Computation-Statistics Tradeoffs for Kernel and Distance based High Dimensional Two Sample Testing
TLDR
This paper formally characterizes the power of popular tests for GDA, such as the Maximum Mean Discrepancy with the Gaussian kernel (gMMD) and bandwidth-dependent variants of the Energy Distance with the Euclidean norm (eED), in the high-dimensional MDA regime.
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing
A two-sample test for high-dimensional data with applications to gene-set testing
We proposed a two sample test for means of high dimensional data when the data dimension is much larger than the sample size. The classical Hotelling's $T^2$ test does not work for this ``large p,
Modeling heterogeneity in ranked responses by nonparametric maximum likelihood: How do Europeans get their scientific knowledge?
This paper is motivated by a Eurobarometer survey on science knowledge. As part of the survey, respondents were asked to rank sources of science information in order of importance. The official
Ranking with Kernels in Fourier space
Universal Kernels on Non-Standard Input Spaces
TLDR
This work provides a general technique based on Taylor-type kernels to explicitly construct universal kernels on compact metric spaces that are not subsets of ℝ^d, and applies this technique to the following special cases: universal kernels on the set of probability measures, universal kernels based on Fourier transforms, and universal kernels for signal processing.