Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
- Alec Radford, Luke Metz, Soumith Chintala
- Computer Science · International Conference on Learning Representations
- 19 November 2015
This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs) that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.
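A minimal sketch of what those architectural constraints look like in practice, written in PyTorch rather than the authors' original code, with illustrative layer sizes: no fully connected hidden layers, fractionally-strided (transposed) convolutions for upsampling, batch normalization, and ReLU activations with a tanh output.

```python
# Minimal DCGAN-style generator sketch (PyTorch; not the authors' code).
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    def __init__(self, z_dim=100, base_channels=64, out_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            # Project z to a 4x4 feature map with a transposed convolution.
            nn.ConvTranspose2d(z_dim, base_channels * 8, 4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(base_channels * 8),
            nn.ReLU(inplace=True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(base_channels * 8, base_channels * 4, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base_channels * 4),
            nn.ReLU(inplace=True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(base_channels * 4, base_channels * 2, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base_channels * 2),
            nn.ReLU(inplace=True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(base_channels * 2, base_channels, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base_channels),
            nn.ReLU(inplace=True),
            # 32x32 -> 64x64, tanh output in [-1, 1]
            nn.ConvTranspose2d(base_channels, out_channels, 4, stride=2, padding=1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        # z: (batch, z_dim) -> reshape to (batch, z_dim, 1, 1) feature columns.
        return self.net(z.view(z.size(0), -1, 1, 1))
```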
BEGAN: Boundary Equilibrium Generative Adversarial Networks
- David Berthelot, Tom Schumm, Luke Metz
- Computer Science · ArXiv
- 31 March 2017
This work proposes a new equilibrium-enforcing method, paired with a loss derived from the Wasserstein distance, for training auto-encoder-based Generative Adversarial Networks; it provides a new approximate convergence measure, fast and stable training, and high visual quality.
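A hedged sketch of that equilibrium bookkeeping, with `gamma`, `lambda_k`, and the scalar reconstruction losses as illustrative stand-ins rather than the authors' code:

```python
# Hedged sketch of a BEGAN-style equilibrium step; L(.) is the autoencoder
# (discriminator) reconstruction loss. Names and constants are illustrative.
def began_step(recon_loss_real, recon_loss_fake, k, gamma=0.5, lambda_k=0.001):
    """One bookkeeping step given scalar reconstruction losses.

    recon_loss_real: L(x), autoencoder loss on real images
    recon_loss_fake: L(G(z)), autoencoder loss on generated images
    k:               current equilibrium variable k_t in [0, 1]
    """
    loss_discriminator = recon_loss_real - k * recon_loss_fake
    loss_generator = recon_loss_fake
    # Proportional control keeps E[L(G(z))] near gamma * E[L(x)].
    k = min(max(k + lambda_k * (gamma * recon_loss_real - recon_loss_fake), 0.0), 1.0)
    # Global convergence measure of the kind the summary refers to.
    convergence = recon_loss_real + abs(gamma * recon_loss_real - recon_loss_fake)
    return loss_discriminator, loss_generator, k, convergence
```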
Unrolled Generative Adversarial Networks
- Luke Metz, Ben Poole, David Pfau, Jascha Narain Sohl-Dickstein
- Computer Science · International Conference on Learning Representations
- 4 November 2016
This work introduces a method to stabilize Generative Adversarial Networks by defining the generator objective with respect to an unrolled optimization of the discriminator, and shows how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator.
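A toy illustration of the unrolling idea on a logistic-regression discriminator (PyTorch; not the authors' implementation): the generator loss is evaluated after K differentiable discriminator steps, so its gradient accounts for how the discriminator would respond.

```python
# Hedged sketch of unrolled GAN training with a tiny linear discriminator.
import torch
import torch.nn.functional as F

def unrolled_generator_loss(d_params, gen_samples, real_samples, k=5, inner_lr=0.1):
    """d_params: (w, b) tensors with requires_grad=True; w is (D, 1), b is (1,)."""
    w, b = d_params

    def d_logits(x, w, b):
        return x @ w + b

    def d_loss(w, b):
        # Standard GAN discriminator loss (real labeled 1, fake labeled 0).
        real_term = F.binary_cross_entropy_with_logits(
            d_logits(real_samples, w, b), torch.ones(len(real_samples), 1))
        fake_term = F.binary_cross_entropy_with_logits(
            d_logits(gen_samples, w, b), torch.zeros(len(gen_samples), 1))
        return real_term + fake_term

    # Unroll k discriminator gradient steps, keeping the graph so the
    # generator's gradient sees how the discriminator would react.
    for _ in range(k):
        gw, gb = torch.autograd.grad(d_loss(w, b), (w, b), create_graph=True)
        w, b = w - inner_lr * gw, b - inner_lr * gb

    # Generator wants the *unrolled* discriminator to call its samples real.
    return F.binary_cross_entropy_with_logits(
        d_logits(gen_samples, w, b), torch.ones(len(gen_samples), 1))
```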
Adversarial Spheres
- J. Gilmer, Luke Metz, Ian J. Goodfellow
- Computer Science · International Conference on Learning Representations
- 9 January 2018
A fundamental tradeoff between the amount of test error and the average distance to the nearest error is shown, proving that any model which misclassifies a small constant fraction of a sphere will be vulnerable to adversarial perturbations of size $O(1/\sqrt{d})$.
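A loose paraphrase of the scaling in that summary, stated via concentration of measure on the sphere (constants omitted; not the paper's exact theorem):

```latex
% E is the set of misclassified points on the unit sphere S^{d-1},
% \mu(E) its fraction, and d(x, E) the distance from x to the nearest error.
\[
  \mu(E) = \varepsilon > 0 \ \text{(a constant fraction of errors)}
  \;\Longrightarrow\;
  \operatorname{median}_{x \sim S^{d-1}} d(x, E) = O\!\left(\tfrac{1}{\sqrt{d}}\right),
\]
% i.e. in high dimension even a small constant error rate forces typical
% points to lie within an O(1/\sqrt{d}) perturbation of a misclassified point.
```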
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
- A. Srivastava, Abhinav Rastogi, Uri Shaham
- Computer Science · ArXiv
- 9 June 2022
Evaluation of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters, finds that model performance and calibration both improve with scale but are poor in absolute terms.
Guided evolutionary strategies: augmenting random search with surrogate gradients
- Niru Maheswaranathan, Luke Metz, G. Tucker, Dami Choi, Jascha Narain Sohl-Dickstein
- Computer Science · International Conference on Machine Learning
- 26 June 2018
This work proposes Guided Evolutionary Strategies, a method for optimally using surrogate gradient directions along with random search, and defines a search distribution for evolutionary strategies that is elongated along a guiding subspace spanned by the surrogate gradients.
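A NumPy sketch of that elongated search distribution and its antithetic estimator; the constants and scaling are illustrative rather than the paper's exact choices.

```python
# Hedged sketch of Guided Evolutionary Strategies: antithetic random search
# whose perturbation covariance is stretched along a low-dimensional subspace
# spanned by surrogate gradients.
import numpy as np

def guided_es_gradient(f, x, surrogate_grads, n_pairs=16,
                       alpha=0.5, sigma=0.1, beta=2.0, rng=None):
    """Estimate a descent direction for f at x.

    surrogate_grads: (k, n) array of possibly-biased gradient directions.
    alpha: trade-off between isotropic search and the guiding subspace.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.size
    # Orthonormal basis U (n, k) of the guiding subspace.
    U, _ = np.linalg.qr(surrogate_grads.T)
    k = U.shape[1]

    grad_estimate = np.zeros(n)
    for _ in range(n_pairs):
        # Sample eps ~ N(0, sigma^2 * (alpha/n * I + (1-alpha)/k * U U^T)).
        eps = sigma * (np.sqrt(alpha / n) * rng.standard_normal(n)
                       + np.sqrt((1 - alpha) / k) * U @ rng.standard_normal(k))
        # Antithetic (mirrored) function evaluations.
        grad_estimate += (f(x + eps) - f(x - eps)) * eps
    return beta / (2 * sigma**2 * n_pairs) * grad_estimate
```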
Understanding and correcting pathologies in the training of learned optimizers
- Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Freeman, Jascha Narain Sohl-Dickstein
- Computer Science · International Conference on Machine Learning
- 24 October 2018
This work proposes a training scheme that overcomes common difficulties in training learned optimizers by dynamically weighting two unbiased gradient estimators for a variational loss on optimizer performance, allowing neural networks to be trained to perform optimization of a specific task faster than tuned first-order methods.
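One plausible reading of "dynamically weighting two unbiased gradient estimators", sketched in NumPy with inverse-variance weights as a simple concrete choice; the function and variable names here are illustrative, not the paper's code.

```python
# Hedged sketch: combine two unbiased estimators of the gradient of a smoothed
# ("variational") meta-objective -- e.g. one from reparameterization, one from
# evolution strategies -- with per-parameter inverse-variance weights.
import numpy as np

def combine_gradient_estimators(reparam_grads, es_grads, eps=1e-8):
    """reparam_grads, es_grads: (n_samples, n_params) per-sample estimates
    of the same smoothed meta-objective gradient."""
    g_rp, g_es = reparam_grads.mean(0), es_grads.mean(0)
    # Empirical variance of each estimator's mean, per parameter.
    var_rp = reparam_grads.var(0) / reparam_grads.shape[0]
    var_es = es_grads.var(0) / es_grads.shape[0]
    # Lean on whichever estimator is less noisy for each parameter.
    w = var_es / (var_rp + var_es + eps)
    return w * g_rp + (1 - w) * g_es
```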
Discrete Sequential Prediction of Continuous Actions for Deep RL
- Luke Metz, Julian Ibarz, N. Jaitly, James Davidson
- Computer Science · ArXiv
- 14 May 2017
This paper shows how Q-values and policies over continuous spaces can be modeled using a next step prediction model over discretized dimensions, and demonstrates empirically that the method can perform global search, which effectively gets around the local optimization issues that plague DDPG.
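A small sketch of the sequential-discretization idea: the continuous action is built one discretized dimension at a time, each chosen greedily by a next-step Q model. Here `q_fn` and the bin layout are hypothetical stand-ins, not the paper's network.

```python
# Hedged sketch: greedy action selection over sequentially discretized dimensions.
import numpy as np

def select_action(q_fn, state, action_dim, n_bins=11, low=-1.0, high=1.0):
    """q_fn(state, partial_action, dim) -> (n_bins,) array of Q-values for the
    candidate bins of dimension `dim`, given the dimensions chosen so far."""
    bins = np.linspace(low, high, n_bins)
    partial_action = []
    for dim in range(action_dim):
        q_values = q_fn(state, np.array(partial_action), dim)
        # Pick the best bin for this dimension, then condition on it.
        partial_action.append(bins[int(np.argmax(q_values))])
    return np.array(partial_action)
```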
Meta-Learning Update Rules for Unsupervised Representation Learning
- Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Narain Sohl-Dickstein
- Computer Science · International Conference on Learning Representations
- 31 March 2018
This work targets semi-supervised classification performance and meta-learns an algorithm -- an unsupervised weight update rule -- that produces representations useful for this task; the rule is constrained to be a biologically motivated, neuron-local function, which enables it to generalize to different neural network architectures, datasets, and data modalities.
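A loose illustration of what a neuron-local update rule means operationally: each weight update depends only on quantities available at its own pre- and post-synaptic neurons, computed by a learned function. The Hebbian-style form below is a stand-in for the meta-learned network; all names are illustrative.

```python
# Hedged sketch of a neuron-local weight update; the real rule is a small
# learned network, but it sees only these local terms.
import numpy as np

def local_update(weights, pre_activity, post_signal, meta_params):
    """weights: (n_pre, n_post); pre_activity: (batch, n_pre);
    post_signal: (batch, n_post) locally available learning signal;
    meta_params: dict of scalars standing in for the meta-learned rule."""
    # Hebbian-style outer product modulated by learned coefficients.
    hebbian = pre_activity.T @ post_signal / len(pre_activity)
    decay = meta_params.get("decay", 0.01) * weights
    return weights + meta_params.get("lr", 0.1) * hebbian - decay
```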
Towards GAN Benchmarks Which Require Generalization
- Ishaan Gulrajani, Colin Raffel, Luke Metz
- Computer Science · International Conference on Learning Representations
- 10 January 2020
A necessary condition is clarified for an evaluation metric to resist being "won" by training-set memorization: estimating the metric must require a large sample from the model. The resulting benchmarks cannot be gamed by memorizing the training set, while still being perceptually correlated and computable only from samples.
...