Improved Training of Wasserstein GANs
- Ishaan Gulrajani, Faruk Ahmed, Martín Arjovsky, Vincent Dumoulin, Aaron C. Courville
- Computer Science · NIPS
- 31 March 2017
This work proposes an alternative to weight clipping: penalizing the norm of the gradient of the critic with respect to its input, which performs better than the standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
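The gradient penalty described above can be sketched in a few lines. This is a hedged illustration, not the authors' code: the critic here is a simple linear function `f(x) = w . x`, whose input gradient is just `w`, so the penalty term can be computed in closed form (the function name and `lam=10.0` default are illustrative).

```python
import numpy as np

def gradient_penalty_linear(w, real, fake, lam=10.0, rng=None):
    """WGAN-GP-style penalty for a linear critic f(x) = w . x.
    Samples points on lines between real and fake data and penalizes
    deviation of the critic's input-gradient norm from 1."""
    rng = np.random.default_rng(rng)
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake    # interpolated samples
    grad = np.tile(w, (x_hat.shape[0], 1))     # d f / d x_hat = w for every sample
    norms = np.linalg.norm(grad, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)

w = np.array([0.6, 0.8])                       # ||w|| = 1, so the penalty vanishes
real = np.random.randn(4, 2)
fake = np.random.randn(4, 2)
print(gradient_penalty_linear(w, real, fake))  # ≈ 0.0 because ||w|| = 1
```

With a real (nonlinear) critic the per-sample gradient would come from automatic differentiation rather than being constant, but the penalty term is assembled the same way.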
Wasserstein Generative Adversarial Networks
- Martín Arjovsky, Soumith Chintala, L. Bottou
- Computer Science · International Conference on Machine Learning
- 17 July 2017
This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.
Invariant Risk Minimization
- Martín Arjovsky, L. Bottou, Ishaan Gulrajani, David Lopez-Paz
- Computer Science · ArXiv
- 5 July 2019
This work introduces Invariant Risk Minimization, a learning paradigm to estimate invariant correlations across multiple training distributions and shows how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
Towards Principled Methods for Training Generative Adversarial Networks
- Martín Arjovsky, L. Bottou
- Computer Science · International Conference on Learning Representations
- 17 January 2017
This paper takes theoretical steps towards fully understanding the training dynamics of generative adversarial networks, and performs targeted experiments to substantiate the theoretical analysis, verify assumptions, illustrate claims, and quantify the phenomena.
Adversarially Learned Inference
- Vincent Dumoulin, Ishmael Belghazi, Aaron C. Courville
- Computer Science · International Conference on Learning Representations
- 2 June 2016
The adversarially learned inference (ALI) model is introduced, which jointly learns a generation network and an inference network using an adversarial process; the usefulness of the learned representations is confirmed by performance competitive with the state of the art on the semi-supervised SVHN and CIFAR10 tasks.
Unitary Evolution Recurrent Neural Networks
- Martín Arjovsky, Amar Shah, Yoshua Bengio
- Computer Science · International Conference on Machine Learning
- 20 November 2015
This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state-of-the-art results on several hard tasks involving very long-term dependencies.
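The composition idea can be illustrated with a toy example. This is a hypothetical sketch, not the paper's exact parameterization: it composes a diagonal of learned unit-modulus phases, a fixed permutation, and the unitary DFT, and checks that the product stays unitary (so hidden-state norms are preserved exactly).

```python
import numpy as np

def phase_diag(theta):
    """Diagonal unitary factor with learnable phases."""
    return np.diag(np.exp(1j * theta))

def dft_matrix(n):
    """Unitary discrete Fourier transform matrix."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

n = 4
rng = np.random.default_rng(0)
D = phase_diag(rng.uniform(0, 2 * np.pi, n))   # learned phases (illustrative)
P = np.eye(n)[rng.permutation(n)]              # fixed permutation
F = dft_matrix(n)                              # fixed unitary FFT
W = D @ P @ F                                  # composed recurrent matrix

# Products of unitary factors are unitary: W^H W = I.
print(np.allclose(W.conj().T @ W, np.eye(n)))  # True
```

Because each factor is unitary, the composed matrix cannot amplify or shrink the hidden state, which is what prevents exploding and vanishing gradients over long sequences.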
Never Give Up: Learning Directed Exploration Strategies
- Adrià Puigdomènech Badia, P. Sprechmann, C. Blundell
- Computer Science · International Conference on Learning Representations
- 14 February 2020
This work constructs an episodic memory-based intrinsic reward using k-nearest neighbors over the agent's recent experience to train the directed exploratory policies, thereby encouraging the agent to repeatedly revisit all states in its environment.
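A minimal sketch of such an episodic kNN novelty bonus, in the spirit of (but not identical to) the paper's reward: the function name and constants here are illustrative. States whose embeddings sit far from their k nearest neighbours in the episode's memory receive a large bonus; familiar states receive a small one.

```python
import numpy as np

def knn_intrinsic_reward(memory, state, k=3, eps=1e-3):
    """Novelty bonus from k-nearest-neighbour distances in episodic memory.
    A kernel maps small distances to ~1 (familiar) and large ones to ~0
    (novel); the reward is the inverse square root of the summed kernel."""
    if len(memory) < k:
        return 1.0
    d2 = np.sum((np.asarray(memory) - state) ** 2, axis=1)
    nearest = np.sort(d2)[:k]                  # k smallest squared distances
    kernel = eps / (nearest + eps)             # ~1 when familiar, ~0 when novel
    return float(1.0 / np.sqrt(np.sum(kernel) + 1e-8))

memory = [np.array([0.0, 0.0])] * 5
print(knn_intrinsic_reward(memory, np.array([0.0, 0.0])))    # familiar: small bonus
print(knn_intrinsic_reward(memory, np.array([10.0, 10.0])))  # novel: large bonus
```

In the full method the memory holds learned embeddings of recent observations and the bonus is combined with a long-term novelty signal, but the kNN core looks like this.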
Symplectic Recurrent Neural Networks
- Zhengdao Chen, Jianyu Zhang, Martín Arjovsky, L. Bottou
- Computer Science · International Conference on Learning Representations
- 29 September 2019
It is shown that SRNNs succeed reliably on complex and noisy Hamiltonian systems, and how to augment the SRNN integration scheme in order to handle stiff dynamical systems such as bouncing billiards.
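An SRNN builds its recurrence around a symplectic integrator. The sketch below shows the classic leapfrog (Störmer–Verlet) step on a harmonic oscillator with Hamiltonian H(q, p) = p²/2 + q²/2; this is an illustration of the integration scheme, not the paper's learned Hamiltonian.

```python
def leapfrog(q, p, grad_V, dt, steps):
    """Symplectic integration of dq/dt = p, dp/dt = -dV/dq."""
    for _ in range(steps):
        p = p - 0.5 * dt * grad_V(q)   # half kick
        q = q + dt * p                 # drift
        p = p - 0.5 * dt * grad_V(q)   # half kick
    return q, p

grad_V = lambda q: q                   # V(q) = q^2 / 2 (harmonic oscillator)
q, p = leapfrog(1.0, 0.0, grad_V, dt=0.01, steps=1000)
energy = 0.5 * p ** 2 + 0.5 * q ** 2
print(abs(energy - 0.5) < 1e-3)        # energy is nearly conserved: True
```

Unlike a generic RNN update, the symplectic structure keeps the energy error bounded over long rollouts, which is why these models remain stable on Hamiltonian systems.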
Simple data balancing achieves competitive worst-group-accuracy
- Badr Youbi Idrissi, Martín Arjovsky, M. Pezeshki, David Lopez-Paz
- Computer Science · CLEaR
- 27 October 2021
The results show that simple data balancing baselines achieve state-of-the-art worst-group accuracy, while being faster to train and requiring no additional hyperparameters.
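One of the simple balancing baselines studied is group resampling. The sketch below (function name and constants are illustrative) upsamples every group to the size of the largest one, so all groups contribute equally during training.

```python
import numpy as np

def balanced_indices(groups, rng=0):
    """Return sample indices in which every group appears equally often,
    by resampling each group (with replacement) up to the majority size."""
    rng = np.random.default_rng(rng)
    groups = np.asarray(groups)
    uniq, counts = np.unique(groups, return_counts=True)
    n = counts.max()                   # target size: the largest group
    idx = [rng.choice(np.flatnonzero(groups == g), size=n, replace=True)
           for g in uniq]
    return np.concatenate(idx)

groups = np.array([0] * 90 + [1] * 10)     # imbalanced: 90 vs 10
idx = balanced_indices(groups)
print(np.bincount(groups[idx]))            # [90 90]
```

Training on `idx` instead of the raw indices is the whole intervention; no extra loss terms or hyperparameters are introduced.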
Linear unit-tests for invariance discovery
- Benjamin Aubin, A. Slowik, Martín Arjovsky, L. Bottou, David Lopez-Paz
- Mathematics · ArXiv
- 22 February 2021
The purpose of this note is to propose six linear low-dimensional problems ("unit tests") to evaluate different types of out-of-distribution generalization in a precise manner; it is hoped that these unit tests become a standard stepping stone for researchers in out-of-distribution generalization.