Publications
mixup: Beyond Empirical Risk Minimization
We propose mixup, a simple learning principle that improves the generalization of state-of-the-art neural network architectures.
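The mixup rule is simple enough to sketch: each training example becomes a convex combination of two examples and of their one-hot labels, with the mixing weight drawn from a Beta distribution. A minimal sketch (the function name and the `alpha` default are illustrative choices, not from the paper's code):

```python
import numpy as np

def mixup_batch(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Convex-combine two examples (or batches) and their one-hot labels.

    lam ~ Beta(alpha, alpha); alpha=0.2 is an illustrative default.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2  # labels assumed one-hot / probabilistic
    return x, y, lam
```

The mixed pair is then fed to the network in place of the raw examples, so the model is trained on linear interpolations of inputs and targets.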
Gradient Episodic Memory for Continual Learning
We propose a set of metrics to evaluate models learning over a continuum of data.
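Besides the evaluation metrics, GEM constrains each parameter update so that the loss on an episodic memory of past tasks does not increase. GEM proper solves a small quadratic program over all past tasks; with a single memory gradient this reduces to the projection sketched below (the variant later popularized as A-GEM):

```python
import numpy as np

def project_gradient(g, g_mem):
    """Project the current task gradient g so it does not conflict with
    the gradient g_mem computed on the episodic memory.

    Sketch of the single-constraint case only; GEM itself solves a QP
    over one constraint per past task.
    """
    dot = float(np.dot(g, g_mem))
    if dot >= 0.0:            # no interference: keep g unchanged
        return g
    # remove the component of g that conflicts with the memory gradient
    return g - (dot / float(np.dot(g_mem, g_mem))) * g_mem
```

The projected gradient, by construction, has a non-negative inner product with the memory gradient, so a small step along it does not increase the memory loss to first order.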
Invariant Risk Minimization
We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions.
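The practical IRMv1 objective penalizes, for each training environment, the gradient of that environment's risk with respect to a fixed dummy classifier. A sketch for squared loss and a scalar dummy classifier held at w = 1.0 (the full objective adds this penalty, suitably scaled, to the pooled risk):

```python
import numpy as np

def irmv1_penalty(phi_outputs, targets_per_env):
    """IRMv1-style penalty (a sketch): sum over environments of the
    squared gradient of the per-environment risk R_e(w * phi) with
    respect to a scalar dummy classifier w, evaluated at w = 1.0.
    """
    penalty = 0.0
    for phi, y in zip(phi_outputs, targets_per_env):
        # R_e(w) = mean((w * phi - y)^2); dR_e/dw at w = 1:
        grad = np.mean(2.0 * (phi - y) * phi)
        penalty += grad ** 2
    return float(penalty)
```

The penalty vanishes exactly when the representation's outputs are already optimal in every environment, which is the invariance the paradigm asks for.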
Manifold Mixup: Better Representations by Interpolating Hidden States
We propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations.
Interpolation Consistency Training for Semi-Supervised Learning
We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training deep neural networks in the semi-supervised learning paradigm.
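The ICT consistency term can be sketched as follows, assuming a student/teacher setup in which both models are callables returning predictions; the function and argument names here are illustrative, not the paper's code:

```python
import numpy as np

def ict_consistency_loss(student, teacher, u1, u2, alpha=1.0, rng=None):
    """Interpolation consistency penalty on two unlabeled batches (a sketch).

    Encourages student(mix(u1, u2)) to match the same mix of the
    teacher's predictions on u1 and u2.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    mixed_input = lam * u1 + (1.0 - lam) * u2
    mixed_target = lam * teacher(u1) + (1.0 - lam) * teacher(u2)
    diff = student(mixed_input) - mixed_target
    return float(np.mean(diff ** 2))  # mean-squared consistency loss
```

This loss is added to the usual supervised loss on the labeled examples; in the paper the teacher is a moving average of the student's weights.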
Unifying distillation and privileged information
This paper unifies distillation and privileged information into generalized distillation, a framework to learn from multiple machines and data representations.
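In generalized distillation, a teacher is trained on the privileged representation and the student then learns from a blend of the true hard labels and the teacher's temperature-softened predictions. A sketch of the target construction (the `imitation` and `temperature` defaults are illustrative):

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Numerically stable softmax with temperature."""
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_targets(y_true, teacher_logits, imitation=0.5, temperature=2.0):
    """Blend one-hot labels with the teacher's softened predictions.

    imitation (the lambda of generalized distillation) interpolates
    between pure supervised learning (0) and pure imitation (1).
    """
    soft = softmax(teacher_logits, temperature)
    return imitation * soft + (1.0 - imitation) * y_true
```

The student is then trained on these blended targets with an ordinary classification loss; setting `imitation=1` recovers plain distillation.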
The Randomized Dependence Coefficient
We introduce the Randomized Dependence Coefficient (RDC), a measure of nonlinear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient.
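A minimal sketch of the RDC recipe for two scalar samples, assuming the standard three steps: empirical copula (rank) transform, random sinusoidal features, then the largest canonical correlation between the two feature sets. The feature count `k`, the scale `s`, the ridge `eps`, and the helper names are all illustrative choices, not the paper's defaults:

```python
import numpy as np

def rdc(x, y, k=10, s=1.0, rng=None):
    """Randomized Dependence Coefficient between 1-D samples (a sketch)."""
    rng = np.random.default_rng() if rng is None else rng

    def copula(v):
        n = len(v)
        r = np.argsort(np.argsort(v)) / float(n)   # empirical copula in [0, 1)
        return np.column_stack([r, np.ones(n)])    # append a bias column

    def features(u):
        w = rng.normal(scale=s, size=(u.shape[1], k))
        return np.sin(u @ w)                       # random sinusoidal features

    fx, fy = features(copula(x)), features(copula(y))
    # Largest canonical correlation via the coupled covariance blocks
    c = np.cov(np.hstack([fx, fy]).T)
    cxx, cyy = c[:k, :k], c[k:, k:]
    cxy, cyx = c[:k, k:], c[k:, :k]
    eps = 1e-6 * np.eye(k)                         # small ridge for stability
    m = np.linalg.solve(cxx + eps, cxy) @ np.linalg.solve(cyy + eps, cyx)
    ev = np.linalg.eigvals(m).real
    return float(np.sqrt(np.clip(ev.max(), 0.0, 1.0)))
```

Because only ranks and random features are used, the statistic is invariant to monotone transformations of either variable and runs in time linear in the sample size.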
Optimizing the Latent Space of Generative Networks
We introduce Generative Latent Optimization (GLO), a framework to train deep convolutional generators using simple reconstruction losses.
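The core move in GLO is to drop the discriminator and instead jointly optimize one learnable latent code per training sample together with the generator, under a reconstruction loss. A toy sketch with a linear "generator" standing in for the deep convolutional one (all names and hyperparameters here are illustrative):

```python
import numpy as np

def glo_fit(x, latent_dim=2, steps=200, lr=0.05, rng=None):
    """Toy GLO loop (a sketch): jointly optimize per-sample latent codes z
    and a linear generator W by gradient descent on squared reconstruction
    error.  The real method uses a deep convolutional generator and a
    perceptual reconstruction loss.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = x.shape
    z = rng.normal(scale=0.1, size=(n, latent_dim))   # one code per sample
    w = rng.normal(scale=0.1, size=(latent_dim, d))   # 'generator' weights
    for _ in range(steps):
        err = z @ w - x              # residuals, shape (n, d)
        z -= lr * err @ w.T          # gradient step on the latent codes
        w -= lr * (z.T @ err) / n    # averaged gradient step on the generator
    return z, w
```

After training, nearby latent codes decode to similar reconstructions, which is what lets the learned latent space be sampled and interpolated.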
In Search of Lost Domain Generalization
The goal of domain generalization algorithms is to predict well on distributions different from those seen during training.
Randomized Nonlinear Component Analysis
We leverage randomness to design scalable new variants of nonlinear PCA and CCA; our ideas extend to key multivariate analysis tools such as spectral clustering or LDA.