
Training generative neural networks via Maximum Mean Discrepancy optimization

@article{Dziugaite2015TrainingGN,
  title={Training generative neural networks via Maximum Mean Discrepancy optimization},
  author={Gintare Karolina Dziugaite and Daniel M. Roy and Zoubin Ghahramani},
  journal={ArXiv},
  year={2015},
  volume={abs/1505.03906}
}
We consider training a deep neural network to generate samples from an unknown distribution given i.i.d. data. We frame learning as an optimization minimizing a two-sample test statistic—informally speaking, a good generator network produces samples that cause a two-sample test to fail to reject the null hypothesis. As our two-sample test statistic, we use an unbiased estimate of the maximum mean discrepancy, which is the centerpiece of the nonparametric kernel two-sample test proposed by Gretton et al. (2012).
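The statistic being minimized is the unbiased estimate of the squared maximum mean discrepancy between data samples $\{x_i\}_{i=1}^m$ and generated samples $\{y_j\}_{j=1}^n$ under a characteristic kernel $k$ (Gretton et al., 2012):

\[
\widehat{\mathrm{MMD}}_u^2 \;=\; \frac{1}{m(m-1)} \sum_{i \neq i'} k(x_i, x_{i'}) \;+\; \frac{1}{n(n-1)} \sum_{j \neq j'} k(y_j, y_{j'}) \;-\; \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j).
\]

Below is a minimal sketch of this training idea, assuming a PyTorch-style setup with a Gaussian (RBF) kernel and a toy two-layer generator; the module names, dimensions, and hyperparameters are illustrative and not taken from the paper.

```python
# Hypothetical sketch: train a small generator by minimizing an unbiased
# estimate of MMD^2 between generated samples and data (RBF kernel).
import torch
import torch.nn as nn

NOISE_DIM, DATA_DIM, BATCH = 10, 2, 256  # illustrative sizes

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 64), nn.ReLU(),
    nn.Linear(64, DATA_DIM),
)
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

def rbf_kernel(a, b, bandwidth=1.0):
    # k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))
    d2 = torch.cdist(a, b).pow(2)
    return torch.exp(-d2 / (2 * bandwidth ** 2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    # Unbiased MMD^2 estimate: the diagonal is dropped from the
    # within-sample kernel sums.
    m, n = x.size(0), y.size(0)
    kxx = rbf_kernel(x, x, bandwidth)
    kyy = rbf_kernel(y, y, bandwidth)
    kxy = rbf_kernel(x, y, bandwidth)
    term_xx = (kxx.sum() - kxx.diag().sum()) / (m * (m - 1))
    term_yy = (kyy.sum() - kyy.diag().sum()) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()

def train_step(real_batch):
    # A good generator drives the statistic toward zero, so a two-sample
    # test would fail to reject the null hypothesis of equal distributions.
    noise = torch.randn(BATCH, NOISE_DIM)
    fake = generator(noise)
    loss = mmd2_unbiased(fake, real_batch)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage: fit samples from a shifted 2-D Gaussian (assumed data).
for step in range(2000):
    real = torch.randn(BATCH, DATA_DIM) + 3.0
    train_step(real)
```

In practice the kernel is often taken to be a sum of RBF kernels over several bandwidths, so that no single length scale dominates the statistic.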

Citations

Generative Moment Matching Networks
TLDR
This work formulates a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using MMD to learn to generate codes that can then be decoded to produce samples.
Online Kernel based Generative Adversarial Networks
TLDR
It is shown empirically that OKGANs perform dramatically better than other GAN formulations with respect to reverse KL-divergence on synthetic data, and achieve comparable performance on classical vision datasets such as MNIST, SVHN, and CelebA.
Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning
TLDR
This work proposes an amortized MLE algorithm for training deep energy models, in which a neural sampler is adaptively trained to approximate the likelihood function, and obtains realistic-looking images competitive with state-of-the-art results.
How Well Generative Adversarial Networks Learn Distributions
TLDR
A new notion of regularization is discovered, called the generator-discriminator-pair regularization, that sheds light on the advantage of GANs compared to classical parametric and nonparametric approaches for explicit distribution estimation.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
TLDR
This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples.
A Convex Duality Framework for GANs
TLDR
This work develops a convex duality framework for analyzing GANs, and proves that the proposed hybrid divergence changes continuously with the generative model, which suggests regularizing the discriminator's Lipschitz constant in f-GAN and vanilla GAN.
Understanding Estimation and Generalization Error of Generative Adversarial Networks
TLDR
An upper bound and a minimax lower bound on the estimation error of GAN training are developed; together they justify the generalization ability of GAN training via SGM after multiple passes over the data and reflect the interplay between the discriminator and the generator.
Optimized Maximum Mean Discrepancy
We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD).
On How Well Generative Adversarial Networks Learn Densities: Nonparametric and Parametric Results
TLDR
The rate of convergence for learning distributions under the adversarial framework of Generative Adversarial Networks (GANs), which subsumes Wasserstein, Sobolev, and MMD GANs as special cases, is studied.

References

Showing 1-10 of 17 references
Generative Moment Matching Networks
TLDR
This work formulates a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using MMD to learn to generate codes that can then be decoded to produce samples.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
A Kernel Two-Sample Test
TLDR
This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine whether two samples are drawn from different distributions, and presents two distribution-free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
Deep Boltzmann Machines
TLDR
A new learning algorithm is presented for Boltzmann machines that contain many layers of hidden variables; it is made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.
A Deep and Tractable Density Estimator
TLDR
This work introduces an efficient procedure to simultaneously train a NADE model for each possible ordering of the variables, by sharing parameters across all these models.
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
TLDR
The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for supervised-learning problem for both regression and classification.
Reducing the Dimensionality of Data with Neural Networks
TLDR
This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
Gradient-based learning applied to document recognition
TLDR
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
Injective Hilbert Space Embeddings of Probability Measures
TLDR
This work considers more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective, restricting attention to translation-invariant kernels on Euclidean space.
A Few Notes on Statistical Learning Theory
S. Mendelson, Machine Learning Summer School, 2002
TLDR
The focus of this article is on the theoretical side rather than the applicative one; hence, it does not present examples that may be interesting from a practical point of view but have little theoretical significance.