A Rigorous Theory of Conditional Mean Embeddings

Ilja Klebanov, Ingmar Schuster, Timothy John Sullivan
SIAM Journal on Mathematics of Data Science
Conditional mean embeddings (CMEs) have proven themselves to be a powerful tool in many machine learning applications. They allow the efficient conditioning of probability distributions within the ... 
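The conditioning step behind CMEs can be sketched with the standard regularized empirical estimator, which reduces to a weighted combination of training outputs. The NumPy sketch below is illustrative only; the Gaussian kernel, bandwidth, and regularization level are assumptions, not choices made in the paper.

```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

def cme_weights(X, x, lam=1e-3, sigma=1.0):
    """Weights alpha with mu_{Y|X=x} ~ sum_i alpha_i k(Y_i, .).

    Standard regularized estimator: alpha = (K_X + n*lam*I)^{-1} k_X(x).
    """
    n = X.shape[0]
    K = gaussian_gram(X, X, sigma)
    kx = gaussian_gram(X, x[None, :], sigma)[:, 0]
    return np.linalg.solve(K + n * lam * np.eye(n), kx)

# Toy data: Y = sin(X) + noise. Pairing the weights with a linear kernel
# on Y turns the embedding into an estimate of E[Y | X = x].
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
alpha = cme_weights(X, np.array([1.0]))
est = alpha @ Y  # estimate of E[Y | X = 1] = sin(1)
print(est)
```

With other feature maps of Y, the same weights alpha embed the full conditional distribution, not just its mean.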


Kernel autocovariance operators of stationary processes: Estimation and convergence
The approach is used to examine the nonparametric estimation of Markov transition operators and highlight how the theory can give a consistency analysis for a large family of spectral analysis methods including kernel-based dynamic mode decomposition.
Conditional Bures Metric for Domain Adaptation
  • You-Wei Luo, Chuan-Xian Ren
  • Computer Science
    2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
This work focuses on the conditional distribution shift problem, and proposes the Conditional Kernel Bures (CKB) metric for characterizing conditional distribution discrepancy, and derives an empirical estimation for the CKB metric without introducing the implicit kernel feature map.
Kernel Partial Correlation Coefficient -- a Measure of Conditional Dependence
In this paper we propose and study a class of simple, nonparametric, yet interpretable measures of conditional dependence between two random variables Y and Z given a third variable X, all taking …
Convergence Rates for Learning Linear Operators from Noisy Data
This work establishes posterior contraction rates with respect to a family of Bochner norms as the number of data tends to infinity, derives related lower bounds on the estimation error, and connects the posterior consistency results to nonparametric learning theory.
Sobolev Norm Learning Rates for Conditional Mean Embeddings
This work develops novel learning rates for conditional mean embeddings by applying the theory of interpolation for reproducing kernel Hilbert spaces (RKHS) and demonstrates that in certain parameter regimes, it can achieve uniform convergence rates in the output RKHS.
The linear conditional expectation in Hilbert space
The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important role in approximate Bayesian inference …
Adaptive joint distribution learning
This framework accommodates a low-dimensional, positive, and normalized model of a Radon–Nikodym derivative, estimated from sample sizes of up to several million data points, alleviating the inherent limitations of RKHS modeling.
Conditional Bures Metric for Domain Adaptation (Supplementary Material)
This supplementary material contains the proofs of theorems and some details on the experiment setting: 1) we present discussions of the proposed method; 2) we show some details on the proposed …
Constrained Polynomial Likelihood
A non-negative polynomial minimum-norm likelihood ratio is developed such that dp = ξ dz satisfies a certain type of shape restriction, and the coefficients of the polynomial are the unique solution of a mixed conic semidefinite program.


A Measure-Theoretic Approach to Kernel Conditional Mean Embeddings
A new operator-free, measure-theoretic definition of the conditional mean embedding as a random variable taking values in a reproducing kernel Hilbert space is presented, and a thorough analysis of its properties, including universal consistency is provided.
On the relation between universality, characteristic kernels and RKHS embedding of measures
The main contribution of this paper is to clarify the relation between universal and characteristic kernels by presenting a unifying study relating them to the RKHS embedding of measures, in addition to clarifying their relation to other common notions of strictly positive definite, conditionally strictly positive definite, and integrally strictly positive definite kernels.
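The embedding-of-measures viewpoint can be made concrete with the maximum mean discrepancy (MMD): for a characteristic kernel, the RKHS distance between the embeddings of two distributions vanishes only when the distributions coincide. The sketch below checks this empirically; the Gaussian kernel, bandwidth, and sample sizes are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    """Biased empirical squared MMD ||mu_P - mu_Q||^2 in the RKHS."""
    return (gaussian_gram(X, X, sigma).mean()
            - 2 * gaussian_gram(X, Y, sigma).mean()
            + gaussian_gram(Y, Y, sigma).mean())

rng = np.random.default_rng(1)
same = mmd2(rng.standard_normal((500, 1)),
            rng.standard_normal((500, 1)))          # same distribution
diff = mmd2(rng.standard_normal((500, 1)),
            rng.standard_normal((500, 1)) + 2.0)    # shifted distribution
print(same < diff)  # embeddings separate distinct distributions
```

The Gaussian kernel is characteristic on R^d, which is exactly why the shifted sample yields a markedly larger MMD.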
Hilbert space embeddings of conditional distributions with applications to dynamical systems
This paper derives a kernel estimate for the conditional embedding, shows its connection to ordinary embeddings, and develops a nonparametric method for modeling dynamical systems in which the belief state of the system is maintained as a conditional embedding.
Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces
We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an …
Nonparametric Bayesian Inference with Kernel Mean Embedding
Kernel methods have been successfully used in many machine learning problems with favorable performance in extracting nonlinear structure of high-dimensional data. Recently, nonparametric inference …
Foundations of modern probability
* Measure Theory: Basic Notions * Measure Theory: Key Results * Processes, Distributions, and Independence * Random Sequences, Series, and Averages * Characteristic Functions and Classical Limit …
Vector Measures, volume 15 of Mathematical Surveys
  • American Mathematical Society, Providence
  • 1977
Almost sure convergence of the largest and smallest eigenvalues of high-dimensional sample correlation matrices
In this paper, we show that the largest and smallest eigenvalues of a sample correlation matrix stemming from n independent observations of a p-dimensional time series with iid components converge …
Theory of Reproducing Kernels and Applications, volume 44 of Developments in Mathematics
  • 2016
Kernel Bayes' rule: Bayesian inference with positive definite kernels
A kernel method for realizing Bayes' rule is proposed, based on representations of probabilities in reproducing kernel Hilbert spaces, including Bayesian computation without likelihood and filtering with a nonparametric state-space model.