Corpus ID: 769926

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages

@inproceedings{Jitkrittum2015KernelBasedJL,
  title={Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages},
  author={Wittawat Jitkrittum and Arthur Gretton and Nicolas Manfred Otto Heess and S. M. Ali Eslami and Balaji Lakshminarayanan and D. Sejdinovic and Zolt{\'a}n Szab{\'o}},
  booktitle={UAI},
  year={2015}
}
We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output. This learned operator replaces the multivariate integral required in classical EP, which may not have an analytic expression. We use kernel-based regression, which is trained on a set of probability distributions representing the incoming messages, and the associated… 
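As a rough illustration of the approach described in the abstract, the following sketch (in Python/NumPy; not the authors' implementation — the Gaussian-moment featurization, the synthetic targets, and the hyperparameters sigma and lam are assumptions made purely for illustration) represents each tuple of incoming Gaussian messages by a small feature vector and fits the mapping to the outgoing message's parameters with kernel ridge regression:

import numpy as np

def featurize_messages(messages):
    # Stack the (mean, log-variance) of each incoming Gaussian message;
    # a stand-in for the paper's kernel mean-embedding features.
    return np.concatenate([[m, np.log(v)] for m, v in messages])

def rbf_gram(X, Z, sigma):
    # RBF kernel matrix between rows of X and rows of Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_message_operator(X, Y, sigma=1.0, lam=1e-3):
    # Kernel ridge regression: dual weights alpha such that
    # f(x) = k(x, X) @ alpha approximates the outgoing-message parameters Y.
    K = rbf_gram(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

# Toy training set: each row featurizes two incoming Gaussian messages; the
# targets are the (mean, log-variance) of the outgoing message, generated here
# by a synthetic rule purely for illustration.
rng = np.random.default_rng(0)
train_msgs = [[(rng.normal(), rng.uniform(0.5, 2.0)) for _ in range(2)] for _ in range(200)]
X = np.array([featurize_messages(ms) for ms in train_msgs])
Y = np.stack([[x[0] + x[2], np.logaddexp(x[1], x[3])] for x in X])

alpha = fit_message_operator(X, Y)
x_test = featurize_messages([(0.3, 1.2), (-0.5, 0.8)])[None, :]
print(rbf_gram(x_test, X, 1.0) @ alpha)  # predicted outgoing-message parameters

Once trained on such input-output pairs, the learned regressor can be queried at prediction time in place of the intractable integral, which is the role the message operator plays inside EP.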
Conditional Expectation Propagation
TLDR
Conditional expectation propagation (CEP) is proposed, which performs conditional moment matching given the variables outside each message and then takes the expectation w.r.t. the approximate posterior of these variables.
Distributed Bayesian Learning with Stochastic Natural Gradient Expectation Propagation and the Posterior Server
TLDR
Stochastic natural gradient expectation propagation is proposed as a novel alternative to expectation propagation, a popular variational inference algorithm, together with a novel architecture for distributed Bayesian learning called the posterior server.
Stochastic Expectation Propagation
TLDR
Stochastic expectation propagation (SEP) is presented, which maintains a global posterior approximation but updates it in a local way (like EP), and is ideally suited to performing approximate Bayesian learning in the large-model, large-dataset setting.
Scalable, Flexible and Active Learning on Distributions
TLDR
This thesis investigates approximate embeddings into Euclidean spaces such that inner products in the embedding space approximate kernel values between the source distributions, and provides a greater understanding of the standard tool for doing so on Euclidean inputs, random Fourier features.
Linear-Time Learning on Distributions with Approximate Kernel Embeddings
TLDR
This work develops the first random features for pdfs whose dot product approximates kernels based on non-Euclidean metrics, allowing estimators to scale to large datasets by working in a primal space, without computing large Gram matrices.
Kernel-based distribution features for statistical tests and Bayesian inference
TLDR
The main focus of this thesis is on the development of linear-time mean-embedding-based methods to automatically extract informative features of data distributions, for statistical tests and Bayesian inference, and the use of random Fourier features to construct approximate kernel mean embeddings.
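As a minimal sketch of the random-Fourier-feature construction mentioned in the entry above (assuming a Gaussian kernel; the sample sizes, bandwidth sigma, and number of features D are illustrative choices), the feature map averaged over a sample yields an approximate kernel mean embedding, and inner products of two such feature means approximate the kernel between the underlying distributions (equivalently, a linear-time MMD estimate):

import numpy as np

def random_fourier_features(X, W, b):
    # phi(x) = sqrt(2/D) * cos(W x + b); E[phi(x) . phi(y)] ~ exp(-||x-y||^2 / (2 sigma^2)).
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
d, D, sigma = 2, 500, 1.0
W = rng.normal(scale=1.0 / sigma, size=(D, d))   # frequencies for the Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Two samples standing in for two distributions P and Q.
P = rng.normal(loc=0.0, size=(1000, d))
Q = rng.normal(loc=0.5, size=(1000, d))

mu_P = random_fourier_features(P, W, b).mean(axis=0)   # approximate mean embedding of P
mu_Q = random_fourier_features(Q, W, b).mean(axis=0)

print("approx <mu_P, mu_Q>:", mu_P @ mu_Q)
print("approx MMD^2:", mu_P @ mu_P - 2 * mu_P @ mu_Q + mu_Q @ mu_Q)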
Importance Weighting Approach in Kernel Bayes' Rule
TLDR
This work studies a nonparametric approach to Bayesian computation via feature means, where the expectation of prior features is updated to yield expected posterior features, based on regression from kernel or neural net features of the observations, resulting in a novel instance of a kernel Bayes’ rule.
Kernel Mean Embedding of Distributions: A Review and Beyond
TLDR
A comprehensive review of existing work and recent advances in the Hilbert space embedding of distributions, together with a discussion of the most challenging issues and open problems that could lead to new research directions.
Proposal: Scalable, Active and Flexible Learning on Distributions
TLDR
This thesis investigates the approach of approximate embeddings into Euclidean spaces such that inner products in the embedding space approximate kernel values between the source distributions and proposes to adapt recent kernel learning techniques to the distributional setting, allowing the automatic selection of good kernels for the task at hand.
Discriminative Embeddings of Latent Variable Models for Structured Data
TLDR
In applications involving millions of data points, it is shown that structure2vec runs 2 times faster and produces models which are 10,000 times smaller, while at the same time achieving state-of-the-art predictive performance.

References

Showing 1-10 of 34 references
Learning to Pass Expectation Propagation Messages
TLDR
This work studies whether it is possible to automatically derive fast and accurate EP updates by learning a discriminative model to map EP message inputs to EP message outputs, and provides empirical analysis on several challenging and diverse factors, indicating that there is a space of factors where this approach appears promising.
Kernel Belief Propagation
TLDR
Kernel Belief Propagation is faster than competing classical and nonparametric approaches (by orders of magnitude, in some cases), while providing significantly more accurate results.
Scalable Kernel Methods via Doubly Stochastic Gradients
TLDR
An approach that scales up kernel methods using a novel concept called "doubly stochastic functional gradients", based on the fact that many kernel methods can be expressed as convex optimization problems; it can readily scale kernel methods up to regimes dominated by neural nets.
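A simplified sketch of the doubly stochastic idea described above (toy data, illustrative step sizes, and a Gaussian kernel; not the paper's code): each iteration draws both a random data point and a random Fourier feature, so the kernel expansion is grown online without ever forming an n-by-n Gram matrix:

import numpy as np

rng = np.random.default_rng(0)
sigma, lam, n_iter = 1.0, 1e-3, 500

# Toy regression data: y = sin(2x) + noise.
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=2000)

Ws, bs, alphas = [], [], []   # one random feature and one coefficient per iteration

def phi(w, b, x):
    # A single random Fourier feature for the Gaussian kernel.
    return np.sqrt(2.0) * np.cos(x @ w + b)

def predict(x):
    return sum(a * phi(w, b, x) for a, w, b in zip(alphas, Ws, bs)) if alphas else 0.0

for t in range(1, n_iter + 1):
    i = rng.integers(len(X))                    # random data point
    w = rng.normal(scale=1.0 / sigma, size=1)   # random frequency
    b = rng.uniform(0.0, 2.0 * np.pi)
    gamma = 0.5 / np.sqrt(t)                    # decaying step size
    err = predict(X[i]) - y[i]
    alphas = [(1.0 - gamma * lam) * a for a in alphas]   # shrink old coefficients (regularization)
    Ws.append(w); bs.append(b); alphas.append(-gamma * err * phi(w, b, X[i]))

print("prediction at x=1:", predict(np.array([1.0])), "vs sin(2) =", np.sin(2.0))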
ABC-EP: Expectation Propagation for Likelihood-free Bayesian Computation
TLDR
It is shown that Expectation Propagation, a widely successful approximate inference technique, can be adapted to the likelihood-free context and the resulting algorithm does not require summary statistics, is an order of magnitude faster than existing techniques, and remains usable when prior information is vague.
Learning Theory for Distribution Regression
TLDR
This paper studies a simple, analytically computable, ridge-regression-based approach to distribution regression, where the distributions are embedded into a reproducing kernel Hilbert space and the regressor is learned from the embeddings to the outputs, establishing the consistency of the classical set kernel.
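A minimal sketch of ridge-regression-based distribution regression with the set (mean-embedding) kernel described above, assuming a Gaussian base kernel and toy bags whose regression target is the bag's true mean (bag sizes, sigma, and lam are illustrative):

import numpy as np

def gauss(A, B, sigma=1.0):
    # Gaussian kernel matrix between the points of two bags.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def set_kernel(bags_a, bags_b, sigma=1.0):
    # K[i, j] = average of k(x, y) over x in bag i and y in bag j
    # (the inner product of the two empirical mean embeddings).
    return np.array([[gauss(A, B, sigma).mean() for B in bags_b] for A in bags_a])

rng = np.random.default_rng(0)
# Each bag is a sample from N(m, 1); the regression target is the bag's mean m.
means = rng.uniform(-2, 2, size=60)
bags = [rng.normal(m, 1.0, size=(50, 1)) for m in means]

K = set_kernel(bags, bags)
alpha = np.linalg.solve(K + 1e-2 * np.eye(len(bags)), means)   # kernel ridge regression

test_bag = [rng.normal(1.5, 1.0, size=(50, 1))]
print("predicted mean of test bag:", (set_kernel(test_bag, bags) @ alpha)[0], "(true: 1.5)")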
Gaussian Processes for Machine Learning
TLDR
The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
Just-In-Time Learning for Fast and Flexible Inference
TLDR
This work introduces just-in-time learning, a framework for fast and flexible inference that learns to speed up inference at run-time and shows how this framework can allow us to combine the flexibility of sampling with the efficiency of deterministic message-passing.
Kernels for Vector-Valued Functions: a Review
TLDR
This monograph reviews different methods to design or learn valid kernel functions for multiple outputs, paying particular attention to the connection between probabilistic and functional methods.
A family of algorithms for approximate Bayesian inference
TLDR
This thesis presents an approximation technique that can perform Bayesian inference faster and more accurately than previously possible, and is found to be convincingly better than rival approximation techniques: Monte Carlo, Laplace's method, and variational Bayes.
Universal Kernels on Non-Standard Input Spaces
TLDR
This work provides a general technique based on Taylor-type kernels to explicitly construct universal kernels on compact metric spaces which are not subsets of ℝ^d, and applies this technique to the following special cases: universal kernels on the set of probability measures, universal kernels based on Fourier transforms, and universal kernels for signal processing.