Hybrid computing using a neural network with dynamic external memory

@article{Graves2016HybridCU,
  title={Hybrid computing using a neural network with dynamic external memory},
  author={Alex Graves and Greg Wayne and Malcolm Reynolds and Tim Harley and Ivo Danihelka and Agnieszka Grabska-Barwinska and Sergio Gomez Colmenarejo and Edward Grefenstette and Tiago Ramalho and John P. Agapiou and Adri{\`a} Puigdom{\`e}nech Badia and Karl Moritz Hermann and Yori Zwols and Georg Ostrovski and Adam Cain and Helen King and Christopher Summerfield and Phil Blunsom and Koray Kavukcuoglu and Demis Hassabis},
  journal={Nature},
  year={2016},
  volume={538},
  pages={471-476}
}
Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. [...] Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read–write memory.
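The mechanism the abstract refers to is differentiable content-based addressing of an external memory matrix. Below is a minimal NumPy sketch of that idea: a lookup key is compared with every memory row by cosine similarity, a softmax turns the similarities into read/write weights, and a write blends erase and add vectors into the addressed rows. The names, sizes and the sharpening parameter beta are illustrative, not the paper's exact interface.

import numpy as np

def content_weights(memory, key, beta):
    """Softmax over cosine similarities between a lookup key and each memory row."""
    eps = 1e-8
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    logits = beta * sims              # beta sharpens or flattens the focus
    logits -= logits.max()            # numerical stability
    w = np.exp(logits)
    return w / w.sum()

rng = np.random.default_rng(0)
memory = rng.normal(size=(16, 8))     # N=16 slots, W=8 dimensions (arbitrary sizes)
key = rng.normal(size=8)
w = content_weights(memory, key, beta=5.0)
read_vector = w @ memory                                          # read: convex combination of rows
erase, add = np.full(8, 0.5), rng.normal(size=8)
memory = memory * (1 - np.outer(w, erase)) + np.outer(w, add)     # write: erase, then add

Because every step is differentiable, gradients flow from the task loss back through the reads and writes into the controller network, which is what allows the whole system to be trained end to end.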
Memory Augmented Neural Networks for Natural Language Processing
TLDR
This tutorial presents a unified architecture for Memory Augmented Neural Networks (MANN) and discusses the ways in which one can address the external memory and hence read/write from it and introduces a variant of MANN.
Artificial intelligence: Deep neural reasoning
  • H. Jaeger
  • Computer Science, Medicine
    Nature
  • 2016
TLDR
A hybrid learning machine composed of a neural network that can read from and write to an external memory structure, analogous to the random-access memory in a conventional computer, learns to plan routes on the London Underground and to achieve goals in a block puzzle merely by trial and error.
Reinforcement-based Program Induction in a Neural Virtual Machine
TLDR
The results show that program induction via reinforcement learning is possible using sparse rewards and solely neural computations.
Continual and One-Shot Learning Through Neural Networks with Dynamic External Memory
TLDR
It is demonstrated that the ENTM is able to perform one-shot learning in reinforcement learning tasks without catastrophic forgetting of previously stored associations, and a new ENTM default jump mechanism is introduced that makes it easier to find unused memory locations and facilitates the evolution of continual learning networks.
Distributed Memory based Self-Supervised Differentiable Neural Computer
TLDR
This work introduces a multiple distributed memory block mechanism that stores information independently in each memory block and uses the stored information cooperatively for diverse representations, along with a self-supervised memory loss term that measures how well a given input is written to the memory.
Teaching recurrent neural networks to infer global temporal structure from local examples
TLDR
It is demonstrated that a recurrent neural network (RNN) can learn to modify its representation of complex information using only examples, and the associated learning mechanism is explained with new theory.
Teaching Recurrent Neural Networks to Modify Chaotic Memories by Example
TLDR
It is demonstrated that a recurrent neural network (RNN) can learn to modify its representation of complex information using only examples, and the associated learning mechanism is explained with new theory.
Neural Stored-program Memory
TLDR
A new memory to store weights for the controller, analogous to the stored-program memory in modern computer architectures is introduced, creating differentiable machines that can switch programs through time, adapt to variable contexts and thus resemble the Universal Turing Machine.
Neurocoder: Learning General-Purpose Computation Using Stored Neural Programs
Artificial Neural Networks are uniquely adroit at machine learning by processing data through a network of artificial neurons. The inter-neuronal connection weights represent the learnt Neural [...]
Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers
TLDR
The Neural Harvard Computer is presented, a memory-augmented network architecture that achieves abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow into separate modules, which enables the learning of robust and scalable algorithmic solutions.

References

Showing 1-10 of 42 references
Meta-Learning with Memory-Augmented Neural Networks
TLDR
The ability of a memory-augmented neural network to rapidly assimilate new data and to leverage that data to make accurate predictions after only a few samples is demonstrated.
Sparse Distributed Memory
TLDR
Pentti Kanerva's Sparse Distributed Memory presents a mathematically elegant theory of human long-term memory that resembles the cortex of the cerebellum and provides an overall perspective on neural systems.
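As a companion to the summary above, here is a toy NumPy sketch of a Kanerva-style sparse distributed memory: a fixed set of random hard addresses, with reads and writes activating every location within a Hamming radius of the query. The sizes, the radius and the counter-based storage rule are illustrative choices for a small demo, not Kanerva's exact parameters.

import numpy as np

rng = np.random.default_rng(3)
N, D, RADIUS = 200, 64, 26                       # locations, address width, activation radius
addresses = rng.integers(0, 2, size=(N, D))      # fixed random hard addresses
counters = np.zeros((N, D))                      # contents stored as bit counters

def active(query):
    return np.count_nonzero(addresses != query, axis=1) <= RADIUS

def write(query, word):
    counters[active(query)] += 2 * word - 1      # +1 for a 1 bit, -1 for a 0 bit

def read(query):
    return (counters[active(query)].sum(axis=0) > 0).astype(int)

word = rng.integers(0, 2, size=D)
write(word, word)                                # autoassociative store
noisy = word.copy()
noisy[:5] ^= 1                                   # corrupt 5 bits of the cue
print(np.count_nonzero(read(noisy) != word))     # typically 0: the word is recovered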
Human-level control through deep reinforcement learning
TLDR
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Pointer Networks
TLDR
A new neural architecture, called Ptr-Nets, learns the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence, using a recently proposed mechanism of neural attention; it not only improves over sequence-to-sequence with input attention but also generalizes to variable-size output dictionaries.
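The distinguishing step in a Ptr-Net is that the attention scores over input positions are themselves the output distribution, so the output "vocabulary" grows with the input length. A rough NumPy sketch of one decoding step follows; the matrices W1 and W2, the vector v and the encoder/decoder states are illustrative placeholders rather than the paper's trained parameters.

import numpy as np

def pointer_distribution(enc_states, dec_state, W1, W2, v):
    """Return a softmax over input positions for one decoding step."""
    scores = np.tanh(enc_states @ W1.T + dec_state @ W2.T) @ v   # one score per input token
    scores -= scores.max()
    p = np.exp(scores)
    return p / p.sum()

rng = np.random.default_rng(0)
n, d = 6, 4                                       # 6 input tokens, hidden size 4 (arbitrary)
enc_states = rng.normal(size=(n, d))
dec_state = rng.normal(size=d)
W1, W2, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
print(pointer_distribution(enc_states, dec_state, W1, W2, v))    # sums to 1 over the n positions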
Memory Networks
TLDR
This work describes a new class of learning models called memory networks, which reason with inference components combined with a long-term memory component; they learn how to use these jointly.
Memory traces in dynamical systems
TLDR
The Fisher Memory Curve is introduced as a measure of the signal-to-noise ratio (SNR) embedded in the dynamical state relative to the input SNR, and the generality of the theory is illustrated by showing that memory in fluid systems can be sustained by transient non-normal amplification due to convective instability or the onset of turbulence.
Long Short-Term Memory
TLDR
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
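The "constant error carousel" mentioned above is the additive cell-state update: the cell c is modified by gated addition rather than by repeated matrix multiplication, so error signals can pass through it largely unchanged over long lags. A minimal single-step sketch, assuming one weight matrix applied to the concatenated [h; x] vector (an implementation convenience, not the original paper's notation):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps [h; x] to the four gate pre-activations."""
    z = W @ np.concatenate([h, x]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_new = f * c + i * g                     # additive update: the error carousel
    h_new = o * np.tanh(c_new)
    return h_new, c_new

d_in, d_h = 3, 5
rng = np.random.default_rng(1)
W = rng.normal(size=(4 * d_h, d_h + d_in))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
h, c = lstm_step(rng.normal(size=d_in), h, c, W, b)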
End-To-End Memory Networks
TLDR
A neural network with a recurrent attention model over a possibly large external memory that is trained end-to-end, and hence requires significantly less supervision during training, making it more generally applicable in realistic settings.
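The recurrent attention over memory described above can be sketched as a single "hop": the memories are embedded twice (input and output embeddings), the query attends over the input embeddings, and the retrieved output embedding is added back to the query state. The bag-of-words inputs and the matrices A, B, C below are illustrative placeholders, not the trained model.

import numpy as np

def memnet_hop(sentences, query, A, B, C):
    """sentences: (n, vocab) bag-of-words rows; query: (vocab,) bag-of-words vector."""
    m = sentences @ A                    # input memory embeddings  (n, d)
    c = sentences @ C                    # output memory embeddings (n, d)
    u = query @ B                        # query embedding          (d,)
    scores = m @ u
    scores -= scores.max()
    p = np.exp(scores); p /= p.sum()     # soft attention over the n memories
    return u + p @ c                     # updated state, fed to the next hop or the classifier

rng = np.random.default_rng(4)
vocab, d, n = 50, 10, 7
A, B, C = (rng.normal(size=(vocab, d)) for _ in range(3))
sentences = rng.integers(0, 2, size=(n, vocab)).astype(float)
query = rng.integers(0, 2, size=vocab).astype(float)
state = memnet_hop(sentences, query, A, B, C)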
Human-level concept learning through probabilistic program induction
TLDR
A computational model is described that learns in a similar fashion and does so better than current deep learning algorithms and can generate new letters of the alphabet that look “right” as judged by Turing-like tests of the model's output in comparison to what real humans produce.
Simple statistical gradient-following algorithms for connectionist reinforcement learning
TLDR
This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
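The estimator described above adjusts parameters along (reward - baseline) * grad log pi, with no explicit gradient of the reward itself. A hedged sketch on a three-armed bandit with a softmax policy (the bandit, learning rate and running-average baseline are illustrative choices, not the article's experiments):

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(2)
theta = np.zeros(3)                                   # logits of a 3-armed bandit policy
true_rewards = np.array([0.2, 0.5, 0.8])              # Bernoulli success probabilities
baseline, lr = 0.0, 0.1

for _ in range(2000):
    p = softmax(theta)
    a = rng.choice(3, p=p)
    r = float(rng.random() < true_rewards[a])         # stochastic reward
    grad_log_pi = -p
    grad_log_pi[a] += 1.0                             # gradient of log softmax(theta)[a]
    theta += lr * (r - baseline) * grad_log_pi        # REINFORCE update
    baseline += 0.05 * (r - baseline)                 # running-average baseline

print(softmax(theta))                                 # probability mass concentrates on arm 2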