Publications
Overcoming catastrophic forgetting in neural networks
TLDR
It is shown that it is possible to overcome catastrophic forgetting in connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
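As a rough illustration of that idea, the sketch below implements an elastic-weight-consolidation-style quadratic penalty, assuming per-weight importance comes from a diagonal Fisher estimate; the names `ewc_penalty`, `theta_star`, and `fisher` are illustrative, not the paper's code.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """Quadratic penalty that slows learning on weights important for a
    previous task (theta_star = weights after that task, fisher =
    per-weight importance estimate)."""
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

def penalized_grad(grad_new_task, theta, theta_star, fisher, lam=1.0):
    # Gradient of the new-task loss plus the consolidation penalty.
    return grad_new_task + lam * fisher * (theta - theta_star)
```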
On the difficulty of training recurrent neural networks
TLDR
This paper proposes a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem, and empirically validates the hypothesis and the proposed solutions.
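A minimal sketch of gradient norm clipping as described, assuming a single flattened gradient vector and a hand-picked threshold:

```python
import numpy as np

def clip_grad_norm(grad, threshold=5.0):
    """If the gradient's L2 norm exceeds the threshold, rescale it down
    to that norm; otherwise leave it untouched."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad
```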
A simple neural network module for relational reasoning
TLDR
This work shows how a deep learning architecture equipped with an RN module can implicitly discover and learn to reason about entities and their relations.
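The RN module composes a pairwise function over all object pairs with a readout function, roughly RN(O) = f_phi(sum over pairs (i, j) of g_theta(o_i, o_j)); the sketch below is a schematic version in which `g_theta` and `f_phi` stand in for small learned networks.

```python
import numpy as np
from itertools import product

def relation_network(objects, g_theta, f_phi):
    """objects: (n, d) array of object representations.
    Sum g_theta over all ordered pairs, then apply the readout f_phi."""
    pair_sum = sum(
        g_theta(np.concatenate([objects[i], objects[j]]))
        for i, j in product(range(len(objects)), repeat=2)
    )
    return f_phi(pair_sum)
```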
Theano: A Python framework for fast computation of mathematical expressions
TLDR
The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models, and recently introduced functionalities and improvements are discussed.
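For context, Theano's core workflow is to build a symbolic expression graph and compile it into a fast callable, with gradients derived symbolically; a minimal example:

```python
import theano
import theano.tensor as T

x = T.dvector('x')
y = T.sum(x ** 2)                  # symbolic expression
gy = T.grad(y, x)                  # symbolic differentiation
f = theano.function([x], [y, gy])  # compile to optimized code

value, gradient = f([1.0, 2.0, 3.0])  # value = 14.0, gradient = [2, 4, 6]
```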
Relational inductive biases, deep learning, and graph networks
TLDR
It is argued that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective.
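One concrete form of such structured computation is the graph network block, which updates edges from their endpoint nodes and nodes from their aggregated incoming edges; the sketch below is a simplified version (global attribute omitted), with `phi_e` and `phi_v` standing in for learned update functions.

```python
import numpy as np

def gn_block(nodes, edges, senders, receivers, phi_e, phi_v):
    """Simplified graph-network block (no global attribute).
    nodes: (n, dv), edges: (m, de); senders/receivers index nodes."""
    receivers = np.asarray(receivers)
    new_edges = np.stack([
        phi_e(edges[k], nodes[senders[k]], nodes[receivers[k]])
        for k in range(len(edges))
    ])
    new_nodes = np.stack([
        phi_v(nodes[i], new_edges[receivers == i].sum(axis=0))
        for i in range(len(nodes))
    ])
    return new_nodes, new_edges
```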
Theano: new features and speed improvements
TLDR
New features and efficiency improvements to Theano are presented, along with benchmarks demonstrating Theano's performance relative to Torch7, a recently introduced machine learning library, and to RNNLM, a C++ library targeted at recurrent neural networks.
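One Theano feature relevant to the recurrent benchmarks is `theano.scan`, which expresses a loop symbolically so it can be compiled and differentiated; a toy recurrence as a hedged illustration:

```python
import theano
import theano.tensor as T

x = T.vector('x')    # input sequence
h0 = T.scalar('h0')  # initial state
w = T.scalar('w')    # recurrence weight

def step(x_t, h_prev, w):
    return x_t + w * h_prev  # h_t = x_t + w * h_{t-1}

h, _ = theano.scan(fn=step, sequences=x, outputs_info=h0, non_sequences=w)
f = theano.function([x, h0, w], h)
# f([1, 2, 3], 0.0, 0.5) -> [1.0, 2.5, 4.25]
```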
Progressive Neural Networks
TLDR
This work evaluates the progressive networks architecture extensively on a wide variety of reinforcement learning tasks, and demonstrates that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
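The architectural mechanism behind that transfer is the lateral connection: each layer of a new column combines its own input with adapter-transformed activations from frozen earlier columns. A minimal per-layer sketch, with `W` and the `laterals` adapters as illustrative names:

```python
import numpy as np

def progressive_layer(h_current, frozen_activations, W, laterals):
    """One layer of a new column: combine the column's own input with
    lateral contributions from earlier, frozen columns."""
    pre = W @ h_current
    for U, h_frozen in zip(laterals, frozen_activations):
        pre = pre + U @ h_frozen  # frozen columns are read, never trained
    return np.maximum(pre, 0.0)   # ReLU
```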
Meta-Learning with Latent Embedding Optimization
TLDR
This work shows that latent embedding optimization (LEO) can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks, and indicates that LEO is able to capture uncertainty in the data and can perform adaptation more effectively by optimizing in latent space.
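Schematically, adaptation in latent space means decoding a low-dimensional code into model parameters, evaluating the task loss, and taking gradient steps on the code rather than on the parameters. The sketch below uses a finite-difference gradient as a stand-in for automatic differentiation; `decode` and `loss_fn` are placeholders, not the paper's API.

```python
import numpy as np

def adapt_in_latent_space(z, decode, loss_fn, steps=5, alpha=0.1, eps=1e-4):
    """Gradient descent on the latent code z of loss_fn(decode(z))."""
    for _ in range(steps):
        grad = np.array([
            (loss_fn(decode(z + eps * e)) - loss_fn(decode(z - eps * e))) / (2 * eps)
            for e in np.eye(len(z))
        ])
        z = z - alpha * grad
    return decode(z)  # adapted parameters
```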
Interaction Networks for Learning about Objects, Relations and Physics
TLDR
The interaction network is introduced: a model, implemented using deep neural networks, which can reason about how objects in complex systems interact, supporting dynamical predictions as well as inferences about the abstract properties of the system.
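In outline, the model computes an effect for every relation from its sender and receiver objects, aggregates incoming effects per object, and predicts each object's update; `f_rel` and `f_obj` below stand in for the learned relational and object models, and effects are assumed to share the objects' dimensionality for simplicity.

```python
import numpy as np

def interaction_step(objects, senders, receivers, f_rel, f_obj):
    """objects: (n, d) array; senders/receivers list relation endpoints."""
    effects = [f_rel(objects[s], objects[r]) for s, r in zip(senders, receivers)]
    updated = []
    for i, o in enumerate(objects):
        incoming = [e for e, r in zip(effects, receivers) if r == i]
        agg = np.sum(incoming, axis=0) if incoming else np.zeros_like(o)
        updated.append(f_obj(o, agg))
    return np.stack(updated)
```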
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
TLDR
This paper proposes a new approach to second-order optimization, the saddle-free Newton method, which can rapidly escape high-dimensional saddle points, unlike gradient descent and quasi-Newton methods. The algorithm is applied to deep and recurrent neural network training, with numerical evidence for its superior optimization performance.
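The core idea is to precondition the gradient by the absolute value of the Hessian, so negative-curvature directions are descended instead of being attracted toward the saddle; the exact small-scale version (the paper scales this up with subspace approximations) looks roughly like:

```python
import numpy as np

def saddle_free_newton_step(grad, hessian, damping=1e-3):
    """Step = -|H|^{-1} grad, where |H| replaces each eigenvalue of the
    Hessian by its absolute value (damping guards near-zero curvature)."""
    eigvals, eigvecs = np.linalg.eigh(hessian)
    scaled = (eigvecs.T @ grad) / (np.abs(eigvals) + damping)
    return -eigvecs @ scaled
```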