Hybrid computing using a neural network with dynamic external memory

Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwinska, Sergio Gomez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John P. Agapiou, Adrià Puigdomènech Badia, Karl Moritz Hermann, Yori Zwols, Georg Ostrovski, Adam Cain, Helen King, Christopher Summerfield, Phil Blunsom, Koray Kavukcuoglu and Demis Hassabis

Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. […] Key Result: Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read–write memory.
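The differentiable addressing at the heart of the DNC can be illustrated with a small sketch. The NumPy fragment below (an illustrative reconstruction, not the authors' code; all names and sizes are made up) shows content-based addressing: a key emitted by the controller is compared to every memory row by cosine similarity, and a softmax sharpened by a scalar strength turns the similarities into read weights.

```python
import numpy as np

def content_weighting(memory, key, beta):
    """Content-based addressing: softmax over cosine similarities
    between the key and each memory row, sharpened by strength beta."""
    eps = 1e-8  # guards against division by zero for all-zero rows
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    exp = np.exp(beta * (sims - sims.max()))  # numerically stable softmax
    return exp / exp.sum()

# Reading: the weights blend memory rows into a single read vector.
memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
w = content_weighting(memory, key=np.array([1.0, 0.0]), beta=10.0)
read_vector = w @ memory  # weighted combination of the rows
```

A high `beta` makes the weighting nearly one-hot (here, concentrated on the first row), while a low `beta` blends many rows.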

Memory Augmented Neural Networks for Natural Language Processing

This tutorial presents a unified architecture for Memory Augmented Neural Networks (MANNs), discusses the ways in which one can address the external memory and hence read from and write to it, and introduces a variant of MANN.

Artificial intelligence: Deep neural reasoning

A hybrid learning machine is presented that is composed of a neural network which can read from and write to an external memory structure, analogous to the random-access memory in a conventional computer; it can learn to plan routes on the London Underground and achieve goals in a block puzzle merely by trial and error.

Reinforcement-based Program Induction in a Neural Virtual Machine

The results show that program induction via reinforcement learning is possible using sparse rewards and solely neural computations.

Continual and One-Shot Learning Through Neural Networks with Dynamic External Memory

It is demonstrated that the ENTM is able to perform one-shot learning in reinforcement learning tasks without catastrophic forgetting of previously stored associations, and a new ENTM default jump mechanism is introduced that makes it easier to find unused memory locations and facilitates the evolution of continual learning networks.

Human Inspired Memory Module for Memory Augmented Neural Networks

  • Amir Bidokhti and S. Ghaemmaghami
  • Computer Science
    2022 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)
  • 2022
This work proposes an external memory module, which is composed of two separate submodules for short- and long-term memories, and shows that this dual-memory system outperforms the neural Turing machine in terms of convergence speed and loss.

Distributed Memory based Self-Supervised Differentiable Neural Computer

This work introduces a multiple distributed memory block mechanism that stores information independently in each memory block and uses the stored information cooperatively for diverse representation, together with a self-supervised memory loss term which measures how well a given input is written to the memory.

Teaching recurrent neural networks to infer global temporal structure from local examples

It is demonstrated that a recurrent neural network (RNN) can learn to modify its representation of complex information using only examples, and the associated learning mechanism is explained with new theory.

Teaching Recurrent Neural Networks to Modify Chaotic Memories by Example


Neural Stored-program Memory

A new memory to store weights for the controller, analogous to the stored-program memory in modern computer architectures, is introduced, creating differentiable machines that can switch programs through time, adapt to variable contexts and thus resemble the Universal Turing Machine.

Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers

The Neural Harvard Computer is presented, a memory-augmented network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow into separated modules to enable the learning of robust and scalable algorithmic solutions.

Meta-Learning with Memory-Augmented Neural Networks

The ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples is demonstrated.

Sparse Distributed Memory

Pentti Kanerva's Sparse Distributed Memory presents a mathematically elegant theory of human long-term memory that resembles the cortex of the cerebellum, and provides an overall perspective on neural systems.
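Kanerva's model is simple enough to sketch directly. The fragment below is a toy reconstruction under assumed sizes, not Kanerva's specification: a write activates every hard location within a Hamming radius of the address and adds the data in bipolar form to its counters; a read sums the activated counters and thresholds at zero.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LOC, DIM, RADIUS = 1000, 64, 25  # illustrative sizes, not from the book

hard_addresses = rng.integers(0, 2, size=(N_LOC, DIM))  # fixed random locations
counters = np.zeros((N_LOC, DIM))

def activated(addr):
    """Locations whose hard address is within Hamming RADIUS of addr."""
    return np.count_nonzero(hard_addresses != addr, axis=1) <= RADIUS

def write(addr, data):
    """Add the bipolar (+1/-1) form of data to every activated counter row."""
    counters[activated(addr)] += 2 * data - 1

def read(addr):
    """Sum the activated counters and threshold at zero to recover bits."""
    return (counters[activated(addr)].sum(axis=0) > 0).astype(int)

pattern = rng.integers(0, 2, size=DIM)
write(pattern, pattern)  # autoassociative storage: address == data
recalled = read(pattern)
```

Because many locations participate in each write, recall degrades gracefully as the memory fills or the cue is corrupted.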

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Pointer Networks

A new neural architecture, called Ptr-Nets, that learns the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence, using a recently proposed mechanism of neural attention; it not only improves over sequence-to-sequence with input attention, but also generalizes to variable-size output dictionaries.
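The pointing mechanism itself is compact: additive attention scores over the encoder states are used directly as the output distribution over input positions. A minimal sketch with illustrative shapes and parameter names (not the paper's code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_step(encoder_states, decoder_state, W1, W2, v):
    """One Ptr-Net decoding step: additive (Bahdanau-style) attention
    scores over the input positions ARE the output distribution."""
    scores = np.tanh(encoder_states @ W1 + decoder_state @ W2) @ v
    return softmax(scores)  # probability of pointing at each input index

rng = np.random.default_rng(1)
enc = rng.standard_normal((5, 4))   # 5 input positions, hidden size 4
dec = rng.standard_normal(4)        # current decoder state
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((4, 8))
v = rng.standard_normal(8)
p = pointer_step(enc, dec, W1, W2, v)  # length-5 distribution over inputs
```

Since the output vocabulary is the set of input positions, the same trained network handles inputs of any length.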

Memory Networks

This work describes a new class of learning models called memory networks, which reason with inference components combined with a long-term memory component; they learn how to use these jointly.

Memory traces in dynamical systems

The Fisher Memory Curve is introduced as a measure of the signal-to-noise ratio (SNR) embedded in the dynamical state relative to the input SNR, and the generality of the theory is illustrated by showing that memory in fluid systems can be sustained by transient nonnormal amplification due to convective instability or the onset of turbulence.

Long Short-Term Memory

A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
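The gated, additive cell update can be sketched in a few lines. The fragment below implements the now-standard LSTM variant with a forget gate (the 1997 formulation differs slightly); names and sizes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One step of a standard LSTM cell. The additive update of the cell
    state c is the 'constant error carousel' that preserves gradients."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)                        # gate pre-activations
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # gated, additive memory
    h_new = sigmoid(o) * np.tanh(c_new)                # exposed hidden state
    return h_new, c_new

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):                                    # run over a toy sequence
    h, c = lstm_step(rng.standard_normal(n_in), h, c, W, b)
```

Because `c_new` depends on `c` through a near-identity additive path rather than a squashing nonlinearity, error signals can flow across many time steps without vanishing.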

Human-level concept learning through probabilistic program induction

A computational model is described that learns in a similar fashion and does so better than current deep learning algorithms and can generate new letters of the alphabet that look “right” as judged by Turing-like tests of the model's output in comparison to what real humans produce.

Simple statistical gradient-following algorithms for connectionist reinforcement learning

This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
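The core of REINFORCE is the score-function estimator: weight the log-likelihood gradient of each sampled action by the reward it received. A toy sketch for a single Bernoulli unit whose firing is rewarded (illustrative code, not Williams' notation):

```python
import numpy as np

def reinforce_update(theta, rng, lr=0.1, n_samples=500):
    """One REINFORCE step for a single Bernoulli unit: move theta along
    the score-function estimate of the gradient of expected reward."""
    p = 1.0 / (1.0 + np.exp(-theta))          # firing probability
    a = (rng.random(n_samples) < p).astype(float)  # sampled actions
    r = a                                      # reward 1 iff the unit fires
    # d/dtheta log Bernoulli(a; sigmoid(theta)) = a - p
    grad = np.mean(r * (a - p))                # E[r * score] estimate
    return theta + lr * grad

rng = np.random.default_rng(0)
theta = 0.0
for _ in range(200):
    theta = reinforce_update(theta, rng)
# theta drifts upward, pushing the firing probability toward 1
```

No gradient of the reward itself is ever computed; only the gradient of the action's log-probability, which is why the estimator works with non-differentiable reward signals.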

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
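The reversal trick is a pure data transformation applied before training, sketched below (the helper name is hypothetical):

```python
def prepare_pairs(sources, targets):
    """Sutskever et al.'s reversal trick: feed the source tokens reversed,
    so early source words sit close to the early target words they align
    with, shortening the dependency paths the LSTM must learn."""
    return [(list(reversed(s)), t) for s, t in zip(sources, targets)]

pairs = prepare_pairs([["a", "b", "c"]], [["A", "B", "C"]])
# the source becomes ["c", "b", "a"]; the target is unchanged
```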