• Publications
Demystifying MMD GANs
Recent work raised concerns about bias in GAN loss functions; this paper clarifies the situation, showing that the gradient estimators used in optimizing both MMD GANs and Wasserstein GANs are unbiased, but that learning a discriminator based on samples leads to biased gradients for the generator parameters.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
This optimized MMD is applied to unsupervised learning with generative adversarial networks (GANs), in which a model attempts to generate realistic samples and a discriminator attempts to tell these apart from data samples.
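The maximum mean discrepancy (MMD) at the heart of this line of work has a simple unbiased estimator. A minimal NumPy sketch, not the paper's implementation; the RBF bandwidth, sample sizes, and distributions here are illustrative only:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Pairwise RBF kernel values k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of the squared MMD between samples X and Y."""
    m, n = len(X), len(Y)
    Kxx = rbf_kernel(X, X, sigma)
    Kyy = rbf_kernel(Y, Y, sigma)
    Kxy = rbf_kernel(X, Y, sigma)
    # Diagonal terms are dropped so the within-sample averages are unbiased.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2_unbiased(rng.normal(size=(200, 2)), rng.normal(3.0, 1.0, size=(200, 2)))
print(same, diff)  # "same" hovers near zero; "diff" is clearly positive
```

In a GAN setting, one set of samples would come from the generator and the other from data, with the generator trained to drive this estimate down.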
On the Error of Random Fourier Features
The uniform error bound on random Fourier features is improved, and novel understanding is given of the embedding's variance, approximation error, and use in some machine learning methods.
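For intuition, the random Fourier feature embedding the paper analyzes can be sketched in a few lines of NumPy; this is a generic illustration, with the bandwidth and feature count chosen arbitrarily:

```python
import numpy as np

def rff_features(X, n_features, sigma=1.0, seed=0):
    """Random Fourier map z with z(x) @ z(y) ~= exp(-||x - y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], n_features))  # spectral samples
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)                  # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
Z = rff_features(X, n_features=5000)
approx = Z @ Z.T                                         # approximate kernel matrix
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact = np.exp(-sq_dists / 2.0)                          # exact RBF kernel, sigma = 1
max_err = np.abs(approx - exact).max()
print(max_err)  # shrinks as n_features grows, roughly like 1/sqrt(n_features)
```

The paper's uniform bounds control exactly this kind of worst-case deviation between the approximate and exact kernel values.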
On gradient regularizers for MMD GANs
It is shown that controlling the gradient of the critic is vital to having a sensible loss function, and a method is devised to enforce exact, analytical gradient constraints at no additional cost compared to existing approximate techniques based on additive regularizers.
Learning Deep Kernels for Non-Parametric Two-Sample Tests
A class of kernel-based two-sample tests is proposed, which aim to determine whether two sets of samples are drawn from the same distribution; the tests apply both to kernels on deep features and to simpler radial basis kernels or multiple kernel learning.
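A basic version of such a test, using a fixed RBF kernel and a permutation null rather than the learned deep kernels the paper studies, can be sketched as follows (sample sizes, bandwidth, and permutation count are illustrative):

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased (V-statistic) squared-MMD estimate with an RBF kernel."""
    Z = np.vstack([X, Y])
    sq = (Z**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2 * Z @ Z.T
    K = np.exp(-sq / (2 * sigma**2))
    m = len(X)
    return K[:m, :m].mean() + K[m:, m:].mean() - 2 * K[:m, m:].mean()

def mmd_permutation_test(X, Y, n_perms=200, sigma=1.0, seed=0):
    """p-value: fraction of random relabellings with an MMD at least as large."""
    rng = np.random.default_rng(seed)
    observed = rbf_mmd2(X, Y, sigma)
    Z, m = np.vstack([X, Y]), len(X)
    hits = 0
    for _ in range(n_perms):
        idx = rng.permutation(len(Z))
        hits += rbf_mmd2(Z[idx[:m]], Z[idx[m:]], sigma) >= observed
    return (hits + 1) / (n_perms + 1)

rng = np.random.default_rng(2)
p_same = mmd_permutation_test(rng.normal(size=(60, 2)), rng.normal(size=(60, 2)))
p_diff = mmd_permutation_test(rng.normal(size=(60, 2)), rng.normal(2.0, 1.0, size=(60, 2)))
print(p_same, p_diff)  # p_diff should be small; p_same typically is not
```

The paper's contribution is to replace the fixed kernel here with one parameterized by a deep network, trained to maximize test power.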
POT: Python Optimal Transport
A Python toolbox that implements several key optimal transport (OT) ideas for the machine learning community. It contains implementations of a number of founding works of OT for machine learning, such as the Sinkhorn algorithm and Wasserstein barycenters, and also provides generic solvers that can be used for conducting novel fundamental research.
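The Sinkhorn scaling iterations that POT implements can be illustrated in plain NumPy; this is a minimal sketch of the algorithm, not POT's actual code, and the histograms and regularization strength are arbitrary:

```python
import numpy as np

def sinkhorn_plan(a, b, M, reg, n_iters=2000):
    """Entropy-regularized OT plan via Sinkhorn-Knopp scaling iterations.

    a, b: source/target histograms; M: cost matrix; reg: regularization strength.
    """
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)   # rescale to match the column marginals
        u = a / (K @ v)     # rescale to match the row marginals
    return u[:, None] * K * v[None, :]

# Toy problem: move mass between two histograms on five points of [0, 1].
x = np.linspace(0.0, 1.0, 5)
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
M = (x[:, None] - x[None, :]) ** 2          # squared-distance costs
P = sinkhorn_plan(a, b, M, reg=0.2)
print(P.sum(axis=1), P.sum(axis=0))  # marginals approach a and b
```

In practice one would call POT's own solvers, which add log-domain stabilization and convergence checks on top of this basic scheme.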
Nonparametric kernel estimators for image classification
We introduce a new discriminative learning method for image classification, assuming that the images are represented by unordered, multi-dimensional, finite sets of feature vectors.
Learning deep kernels for exponential family densities
This work provides a scheme for learning a kernel parameterized by a deep network, which can find complex location-dependent local features of the data geometry, giving a very rich class of density models, capable of fitting complex structures on moderate-dimensional problems.
Does Invariant Risk Minimization Capture Invariance?
It is shown that the Invariant Risk Minimization (IRM) formulation can fail to capture “natural” invariances, at least when used in its practical “linear” form, and even on very simple problems which directly follow the motivating examples for IRM.
Scalable, Flexible and Active Learning on Distributions
This thesis investigates approximate embeddings into Euclidean spaces such that inner products in the embedding space approximate kernel values between the source distributions, and provides a greater understanding of random Fourier features, the standard tool for doing so on Euclidean inputs.