# Training generative neural networks via Maximum Mean Discrepancy optimization

@article{Dziugaite2015TrainingGN, title={Training generative neural networks via Maximum Mean Discrepancy optimization}, author={Gintare Karolina Dziugaite and Daniel M. Roy and Zoubin Ghahramani}, journal={ArXiv}, year={2015}, volume={abs/1505.03906} }

We consider training a deep neural network to generate samples from an unknown distribution given i.i.d. data. We frame learning as an optimization minimizing a two-sample test statistic—informally speaking, a good generator network produces samples that cause a two-sample test to fail to reject the null hypothesis. As our two-sample test statistic, we use an unbiased estimate of the maximum mean discrepancy, which is the centerpiece of the nonparametric kernel two-sample test proposed by…
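The unbiased MMD² estimator the abstract refers to can be sketched directly. Below is a minimal NumPy sketch using a Gaussian kernel; the function names and the bandwidth default are illustrative choices, not taken from the paper:

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel values between the rows of a and b.
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    # Unbiased estimate of MMD^2 between samples x (m rows) and y (n rows):
    # the diagonal self-similarity terms k(x_i, x_i) are excluded.
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()
```

Informally, the estimate is near zero when the two samples come from the same distribution and grows as they separate; a generator trained to minimize it is pushed toward samples a two-sample test cannot distinguish from data.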

## 404 Citations

Generative Moment Matching Networks

- Computer Science · ICML
- 2015

This work presents a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using MMD to learn to generate codes that can then be decoded into samples.

Online Kernel based Generative Adversarial Networks

- Computer Science · ArXiv
- 2020

It is shown empirically that OKGANs perform dramatically better, with respect to reverse KL-divergence, than other GAN formulations on synthetic data, and achieve comparable performance on classical vision datasets such as MNIST, SVHN, and CelebA.

Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning

- Computer Science · ArXiv
- 2016

This work proposes an amortized MLE algorithm for training deep energy models, in which a neural sampler is adaptively trained to approximate the likelihood function, and obtains realistic-looking images competitive with state-of-the-art results.

How Well Generative Adversarial Networks Learn Distributions

- Computer Science · J. Mach. Learn. Res.
- 2021

A new notion of regularization is discovered, called the generator-discriminator-pair regularization, that sheds light on the advantage of GANs compared to classical parametric and nonparametric approaches for explicit distribution estimation.

Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

- Computer Science · ICLR
- 2017

This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples.

A Convex Duality Framework for GANs

- Computer Science · NeurIPS
- 2018

This work develops a convex duality framework for analyzing GANs, and proves that the proposed hybrid divergence changes continuously with the generative model, which suggests regularizing the discriminator's Lipschitz constant in f-GAN and vanilla GAN.

Understanding Estimation and Generalization Error of Generative Adversarial Networks

- Computer Science · IEEE Transactions on Information Theory
- 2021

An upper bound as well as a minimax lower bound on the estimation error for training GANs are developed, which justifies the generalization ability of the GAN training via SGM after multiple passes over the data and reflects the interplay between the discriminator and the generator.

Optimized Maximum Mean Discrepancy

- Computer Science
- 2016

We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum…

On How Well Generative Adversarial Networks Learn Densities: Nonparametric and Parametric Results

- Computer Science · ArXiv
- 2018

The rates of convergence for learning distributions with the adversarial framework and Generative Adversarial Networks (GANs), which subsumes Wasserstein, Sobolev, and MMD GANs as special cases, are studied.

## References

Showing 1–10 of 17 references

Generative Moment Matching Networks

- Computer Science · ICML
- 2015

This work presents a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using MMD to learn to generate codes that can then be decoded into samples.

Generative Adversarial Nets

- Computer Science · NIPS
- 2014

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a…

A Kernel Two-Sample Test

- Mathematics, Computer Science · J. Mach. Learn. Res.
- 2012

This work proposes a framework for analyzing and comparing distributions, which is used to construct statistical tests to determine if two samples are drawn from different distributions, and presents two distribution free tests based on large deviation bounds for the maximum mean discrepancy (MMD).
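One distribution-free way to calibrate an MMD statistic, as in kernel two-sample testing, is a permutation test. A minimal sketch, assuming a Gaussian kernel and using the simpler biased V-statistic (function names and defaults are illustrative):

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    # Biased (V-statistic) MMD^2 with a Gaussian kernel; adequate for permutation calibration.
    def k(a, b):
        d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return np.exp(-d2 / (2 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def mmd_permutation_test(x, y, n_perm=200, sigma=1.0, seed=0):
    # p-value: fraction of random relabelings whose MMD^2 reaches the observed value.
    rng = np.random.default_rng(seed)
    observed = rbf_mmd2(x, y, sigma)
    pooled = np.vstack([x, y])
    m = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if rbf_mmd2(pooled[idx[:m]], pooled[idx[m:]], sigma) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction keeps p > 0
```

A small p-value is evidence the two samples were drawn from different distributions.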

Deep Boltzmann Machines

- Computer Science · AISTATS
- 2009

A new learning algorithm for Boltzmann machines that contain many layers of hidden variables is presented; it is made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.

A Deep and Tractable Density Estimator

- Computer Science · ICML
- 2014

This work introduces an efficient procedure to simultaneously train a NADE model for each possible ordering of the variables, by sharing parameters across all these models.

Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)

- Computer Science
- 2005

The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for supervised-learning problem for both regression and classification.

Reducing the Dimensionality of Data with Neural Networks

- Computer Science · Science
- 2006

This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

Gradient-based learning applied to document recognition

- Computer Science · Proc. IEEE
- 1998

This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.

Injective Hilbert Space Embeddings of Probability Measures

- Computer Science, Mathematics · COLT
- 2008

This work considers more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective, restricting attention to translation-invariant kernels on Euclidean space.

A Few Notes on Statistical Learning Theory

- Education, Computer Science · Machine Learning Summer School
- 2002

The focus of this article is on the theoretical side rather than the applicative one; hence, it does not present examples that may be interesting from a practical point of view but have little theoretical significance.