Corpus ID: 174802436

Noise Contrastive Meta-Learning for Conditional Density Estimation using Kernel Mean Embeddings

@inproceedings{Ton2021NoiseCM,
  title={Noise Contrastive Meta-Learning for Conditional Density Estimation using Kernel Mean Embeddings},
  author={Jean-Francois Ton and Lucian Chan and Yee Whye Teh and D. Sejdinovic},
  booktitle={AISTATS},
  year={2021}
}
Current meta-learning approaches focus on learning functional representations of relationships between variables, i.e. on estimating conditional expectations in regression. In many applications, however, we are faced with conditional distributions which cannot be meaningfully summarized using expectation only (due to e.g. multimodality). Hence, we consider the problem of conditional density estimation in the meta-learning setting. We introduce a novel technique for meta-learning which combines… 
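The abstract names two ingredients: kernel mean embeddings that summarise a task's data set, and a noise-contrastive objective for learning conditional densities. The NumPy sketch below illustrates what each ingredient looks like in isolation; the landmark points, the ranking-style loss and the placeholder score function are illustrative assumptions, not the authors' architecture or code.

import numpy as np

def rbf(a, b, lengthscale=1.0):
    # Gaussian (RBF) kernel matrix between the rows of a and the rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def task_embedding(x_ctx, y_ctx, landmarks):
    # Empirical kernel mean embedding of the context set {(x_i, y_i)},
    # evaluated at fixed landmark points: a finite-dimensional proxy for
    # the RKHS element (1/n) * sum_i k((x_i, y_i), .).
    ctx = np.concatenate([x_ctx, y_ctx], axis=1)
    return rbf(landmarks, ctx).mean(axis=1)

def nce_ranking_loss(score_fn, x, y, noise_sampler, n_noise=16, rng=None):
    # Ranking-style noise-contrastive loss: the observed y must be ranked
    # above n_noise samples from a known noise distribution via a softmax.
    rng = rng if rng is not None else np.random.default_rng(0)
    losses = []
    for xi, yi in zip(x, y):
        candidates = np.vstack([yi[None, :], noise_sampler(n_noise, rng)])
        scores = np.array([score_fn(xi, c) for c in candidates])
        log_probs = scores - np.log(np.exp(scores).sum())
        losses.append(-log_probs[0])  # index 0 is the observed sample
    return float(np.mean(losses))

# Toy usage; the score function here is an arbitrary placeholder, not the
# learned model from the paper (which would also condition on emb).
rng = np.random.default_rng(0)
x_ctx, y_ctx = rng.normal(size=(20, 1)), rng.normal(size=(20, 1))
emb = task_embedding(x_ctx, y_ctx, landmarks=rng.normal(size=(10, 2)))
score = lambda xi, yi: float(-np.sum((yi - xi) ** 2))
noise = lambda n, r: r.normal(size=(n, 1))
print(nce_ranking_loss(score, x_ctx, y_ctx, noise))

Because the noise distribution is known, a model trained with such a contrastive objective can represent multimodal conditional densities without computing a normalising constant, which is the motivation stated in the abstract.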
Citations

Kernel Conditional Density Operators
We introduce a novel conditional density estimation model termed the conditional density operator (CDO). It naturally captures multivariate, multimodal output densities and shows performance that is …
Score Matched Neural Exponential Families for Likelihood-Free Inference
Bayesian Likelihood-Free Inference (LFI) approaches make it possible to obtain posterior distributions for stochastic models with intractable likelihoods by relying on model simulations. In Approximate Bayesian …
BayesIMP: Uncertainty Quantification for Causal Data Fusion
Bayesian Interventional Mean Processes are introduced: a framework that combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in a reproducing kernel Hilbert space while taking into account the uncertainty within each causal graph.
Function Contrastive Learning of Transferable Meta-Representations
A decoupled encoder-decoder approach to supervised meta-learning in which the encoder is trained with a contrastive objective to find a good representation of the underlying function; the resulting representations outperform strong baselines in downstream performance and noise robustness.
Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data
This work provides both theoretical justification and empirical evidence that the proposed meta-testing schemes outperform kernel-based tests learned directly from scarce observations, and it identifies when such schemes will be successful.
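For context, the statistic that such kernel two-sample tests build on is the maximum mean discrepancy (MMD). The short NumPy sketch below computes the standard biased plug-in estimate for a fixed RBF kernel; the lengthscale and toy data are illustrative choices, not part of the meta-testing procedure described in the paper.

import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def mmd2_biased(x, y, lengthscale=1.0):
    # MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)], estimated with plug-in means.
    return (rbf(x, x, lengthscale).mean()
            + rbf(y, y, lengthscale).mean()
            - 2.0 * rbf(x, y, lengthscale).mean())

rng = np.random.default_rng(0)
same = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(loc=1.0, size=(200, 2)))
print(same, diff)  # the statistic is larger when the two distributions differ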
On Contrastive Representations of Stochastic Processes
A unifying framework for learning contrastive representations of stochastic processes (CRESP) that does away with exact reconstruction; the resulting methods tolerate noisy high-dimensional observations better than traditional approaches, and the learned representations transfer to a range of downstream tasks.
Score Matched Conditional Exponential Families for Likelihood-Free Inference
To perform Bayesian inference for stochastic simulator models for which the likelihood is not accessible, Likelihood-Free Inference (LFI) relies on simulations from the model. Standard LFI methods …

References

Showing 1–10 of 37 references.
Conditional mean embeddings as regressors
This work demonstrates an equivalence between reproducing kernel Hilbert space (RKHS) embeddings of conditional distributions and vector-valued regressors, and derives minimax convergence rates that are valid under milder and more intuitive assumptions.
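The regression view can be made concrete with the standard kernel-ridge form of the empirical conditional mean embedding, mu_{Y|X=x} ≈ sum_i alpha_i(x) l(y_i, ·) with alpha(x) = (K_X + n·lambda·I)^{-1} k_X(x). The sketch below is an assumption-laden toy (kernel choices, regulariser and data are mine, not from the paper) that evaluates this embedding at one test point.

import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def cme_weights(x_train, x_test, lam=1e-2):
    # Ridge-regression weights alpha(x*) for each test input (one column each).
    n = len(x_train)
    K = rbf(x_train, x_train)
    return np.linalg.solve(K + n * lam * np.eye(n), rbf(x_train, x_test))

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=(100, 1))
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.shape)
x_star, y_star = np.array([[1.0]]), np.array([[np.sin(3.0)]])
alpha = cme_weights(x, x_star)                    # shape (100, 1)
est = float(alpha[:, 0] @ rbf(y, y_star)[:, 0])   # ≈ E[l(Y, y*) | X = x*]
print(est)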
Gaussian Process Conditional Density Estimation
This work proposes to extend the model's input with latent variables and to use Gaussian processes to map this augmented input onto samples from the conditional distribution, and it illustrates the effectiveness and wide-reaching applicability of the model on a variety of real-world problems.
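As a rough illustration of the augmented-input mechanism only: the paper infers the latent inputs with variational methods, whereas in this sketch they are merely drawn from a standard normal prior, so the code shows the plumbing (a GP over inputs concatenated with latent variables, sampled at test time) rather than the actual estimator. Library calls are standard scikit-learn; everything else is an assumption.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
y = np.where(rng.random((200, 1)) < 0.5, 2.0, -2.0) * x + 0.1 * rng.normal(size=(200, 1))

w = rng.normal(size=(200, 1))                     # latent inputs (prior draws only)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-2)
gp.fit(np.hstack([x, w]), y.ravel())

# The conditional distribution at x* is approximated by predictions over many latent draws.
x_star = np.full((500, 1), 0.8)
w_star = rng.normal(size=(500, 1))
samples = gp.predict(np.hstack([x_star, w_star]))
print(samples.mean(), samples.std())              # spread of the predicted samples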
Kernel Exponential Family Estimation via Doubly Dual Embedding
A connection between kernel exponential family estimation and MMD-GANs is established, revealing a new perspective for understanding GANs; the proposed estimator is also shown to empirically outperform state-of-the-art estimators.
Representation Learning with Contrastive Predictive Coding
This work proposes a universal unsupervised learning approach, termed Contrastive Predictive Coding, for extracting useful representations from high-dimensional data, and demonstrates that the learned representations achieve strong performance on four distinct domains: speech, images, text, and reinforcement learning in 3D environments.
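At the core of the approach is the InfoNCE loss, in which a context representation must identify its matching target encoding among in-batch negatives. A minimal NumPy sketch of that loss follows; the encoders and the autoregressive context network are omitted, and the temperature value is an arbitrary choice.

import numpy as np

def info_nce(context, positives, temperature=0.1):
    # context, positives: (batch, dim); row i of positives is the true match
    # for row i of context, and all other rows serve as negatives.
    c = context / np.linalg.norm(context, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = c @ p.T / temperature                     # (batch, batch) similarity scores
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))       # true pairs lie on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 16))
print(info_nce(z, z + 0.05 * rng.normal(size=z.shape)))   # aligned pairs -> low loss
print(info_nce(z, rng.normal(size=z.shape)))              # random pairs  -> about log(32)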
Deep Kernel Learning
We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs …
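The construction amounts to composing a base kernel with a learned feature map, k_deep(x, x') = k(g(x), g(x')). The sketch below uses an untrained toy MLP with fixed random weights purely to show this composition; it is not the spectral-mixture kernels, scalability tricks, or training procedure of the paper.

import numpy as np

def mlp(x, w1, w2):
    # Tiny feature extractor g: R^d -> R^h (one tanh hidden layer).
    return np.tanh(x @ w1) @ w2

def deep_rbf_kernel(xa, xb, w1, w2, lengthscale=1.0):
    # RBF kernel applied to the learned features rather than the raw inputs.
    ga, gb = mlp(xa, w1, w2), mlp(xb, w1, w2)
    d2 = ((ga[:, None, :] - gb[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(3, 32)), rng.normal(size=(32, 8))
x = rng.normal(size=(5, 3))
print(deep_rbf_kernel(x, x, w1, w2).shape)   # (5, 5) Gram matrix; still positive semi-definite

Composing a positive semi-definite kernel with any deterministic feature map keeps it a valid kernel, which is why the network weights can be learned jointly with the kernel hyperparameters.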
Least-Squares Conditional Density Estimation
A novel method of conditional density estimation suitable for multi-dimensional continuous variables: the conditional density is expressed in terms of a density ratio, and the ratio is estimated directly without going through density estimation.
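In the standard least-squares density-ratio formulation (sketched here from the usual presentation of LSCDE, not quoted from the paper), the conditional density is modelled directly as a linear-in-parameters ratio and the coefficients have a closed form:

\widehat{p}(y \mid x) \approx r_\alpha(x, y) = \sum_{l=1}^{b} \alpha_l \, \varphi_l(x, y),
\qquad
J(\alpha) = \tfrac{1}{2} \iint \big( r_\alpha(x, y) - p(y \mid x) \big)^2 \, p(x) \, dx \, dy .

Dropping the term that does not depend on $\alpha$ gives
$J(\alpha) = \tfrac{1}{2}\alpha^\top H \alpha - h^\top \alpha + \text{const}$, with
$H_{l l'} = \mathbb{E}_{p(x)}\!\left[ \int \varphi_l(x, y)\, \varphi_{l'}(x, y)\, dy \right]$ and
$h_l = \mathbb{E}_{p(x, y)}\!\left[ \varphi_l(x, y) \right]$, so that, with ridge regularisation,
$\widehat{\alpha} = (\widehat{H} + \lambda I)^{-1} \widehat{h}$ for plug-in sample averages $\widehat{H}$ and $\widehat{h}$.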
Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
It is shown that the ranking-based variant of NCE gives consistent parameter estimates under weaker assumptions than the classification-based method, which is closely related to the negative sampling methods now widely used in NLP.
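To make the distinction concrete, the two variants can be written (in common notation, as a sketch rather than the paper's exact statement) for an unnormalised conditional score $s_\theta(x, y)$, noise distribution $q$, and $K$ noise draws $y'_1, \dots, y'_K$ per observation:

\ell_{\text{binary}}(\theta)
= -\log \frac{e^{s_\theta(x, y)}}{e^{s_\theta(x, y)} + K\, q(y)}
\;-\; \sum_{j=1}^{K} \log \frac{K\, q(y'_j)}{e^{s_\theta(x, y'_j)} + K\, q(y'_j)},
\qquad
\ell_{\text{ranking}}(\theta)
= -\log \frac{e^{s_\theta(x, y)} / q(y)}
             {e^{s_\theta(x, y)} / q(y) + \sum_{j=1}^{K} e^{s_\theta(x, y'_j)} / q(y'_j)} .

The classification-based form treats each candidate as an independent real-vs-noise decision, while the ranking form normalises over the whole candidate set.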
Kernel Mean Embedding of Distributions: A Review and Beyond
A comprehensive review of existing work and recent advances in Hilbert space embeddings of distributions, together with a discussion of the most challenging issues and open problems that could lead to new research directions.
Empirical representations of probability distributions via kernel mean embeddings
How to represent probability distributions is a fundamental issue in statistical methods, as this determines all the subsequent estimation procedures. In this thesis, we focus on the approach using …
A Theoretical Analysis of Contrastive Unsupervised Representation Learning
This framework makes it possible to show provable guarantees on the performance of the learned representations on an average classification task built from a subset of the same set of latent classes, and it shows that learned representations can reduce (labeled) sample complexity on downstream tasks.