Corpus ID: 195766830

Understanding Memory Modules on Learning Simple Algorithms

@article{Wang2019UnderstandingMM,
  title={Understanding Memory Modules on Learning Simple Algorithms},
  author={Kexin Wang and Yu Zhou and Shaonan Wang and Jiajun Zhang and Chengqing Zong},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.00820}
}
Recent work has shown that memory modules are crucial for the generalization ability of neural networks on learning simple algorithms. However, we still have little understanding of the working mechanism of memory modules. To alleviate this problem, we apply a two-step analysis pipeline: we first infer a hypothesis about what strategy the model has learned from visualization, and then verify it with a newly proposed qualitative analysis method based on dimension reduction… 
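The abstract names dimension reduction as the verification tool but the excerpt cuts off before the details, so the following is only a minimal sketch of what such a step could look like: project recorded memory-module states to 2-D and check whether they cluster by a hypothesised strategy variable. The arrays `memory_states` and `tracked_symbol` and all sizes are illustrative assumptions, not the authors' code or data.

# Minimal sketch of a dimension-reduction check on memory states.
# All names and data here are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical recording: one memory-state vector per time step, plus
# the symbol the model is hypothesised to be tracking at that step.
memory_states = rng.normal(size=(500, 128))    # (time steps, memory dim)
tracked_symbol = rng.integers(0, 4, size=500)  # assumed strategy variable

# Reduce to two dimensions; clusters separated by `tracked_symbol`
# would support the hypothesis that the memory encodes that symbol.
proj = PCA(n_components=2).fit_transform(memory_states)
for s in range(4):
    pts = proj[tracked_symbol == s]
    print(f"symbol {s}: mean 2-D position {pts.mean(axis=0).round(2)}")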


References

Showing 1-10 of 20 references

Memory Augmented Neural Networks with Wormhole Connections

It is suggested that memory-augmented neural networks can reduce the effects of vanishing gradients by creating shortcut (or wormhole) connections that propagate gradients more effectively, which helps the networks learn temporal dependencies.

Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets

The limitations of standard deep learning approaches are discussed and it is shown that some of these limitations can be overcome by learning how to grow the complexity of a model in a structured way.
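The structured mechanism behind this reference is a continuous stack whose push and pop are soft, differentiable operations (Joulin & Mikolov, 2015). A hedged sketch of that update, with an illustrative function signature of my own rather than the paper's code:

# Sketch of a continuous stack: the new stack is a soft mixture of the
# pushed, popped, and unchanged stacks, weighted by learned action
# probabilities, so the whole structure stays differentiable.
import numpy as np

def soft_stack_update(stack, new_top, p_push, p_pop, p_noop):
    """stack: (depth,) array; action probabilities sum to 1 (illustrative API)."""
    pushed = np.concatenate([[new_top], stack[:-1]])  # shift down, insert on top
    popped = np.concatenate([stack[1:], [0.0]])       # shift up, pad the bottom
    return p_push * pushed + p_pop * popped + p_noop * stack

stack = np.zeros(5)
stack = soft_stack_update(stack, new_top=1.0, p_push=0.9, p_pop=0.05, p_noop=0.05)
stack = soft_stack_update(stack, new_top=2.0, p_push=0.8, p_pop=0.1, p_noop=0.1)
print(stack.round(3))  # mass concentrates on the two pushed values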

Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

This work presents an end-to-end differentiable memory access scheme, which it calls Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories, and achieves asymptotic lower bounds in space and time complexity.
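To make the efficiency claim concrete, here is a minimal sketch in the spirit of SAM's sparse read: content-based attention restricted to the top-K most similar memory rows, so read cost scales with K rather than with the full memory size. This is an assumption-laden toy (dense similarity search stands in for SAM's approximate nearest-neighbour index):

# Illustrative sparse read: softmax over only the K best-matching rows.
import numpy as np

def sparse_read(memory, query, k=4):
    scores = memory @ query                 # similarity to every row
    top = np.argpartition(scores, -k)[-k:]  # indices of the K best rows
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                # softmax over K rows only
    return weights @ memory[top]            # sparse weighted read

memory = np.random.default_rng(1).normal(size=(10_000, 64))
query = memory[42] + 0.01                   # should retrieve row 42
print(np.allclose(sparse_read(memory, query), memory[42], atol=0.5))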

Memory Architectures in Recurrent Neural Network Language Models

The results demonstrate the value of stack-structured memory for explaining the distribution of words in natural language, in line with linguistic theories claiming a context-free backbone for natural language.

Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure

The results indicate that the networks follow a strategy similar to the hypothesised ‘cumulative strategy’, which explains the high accuracy of the network on novel expressions, the generalisation to longer expressions than seen in training, and the mild deterioration with increasing length.
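A diagnostic classifier, as used here, is a simple probe trained on hidden states to predict a variable the network is hypothesised to track. A minimal sketch with synthetic stand-in data (the hidden states, the probed target, and all sizes are assumptions for illustration):

# Fit a linear probe on hidden states; high held-out accuracy would
# suggest the probed information is linearly decodable from the states.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
hidden = rng.normal(size=(1000, 64))                  # stand-in hidden states
target = (hidden[:, :8].sum(axis=1) > 0).astype(int)  # hypothesised variable

probe = LogisticRegression(max_iter=1000).fit(hidden[:800], target[:800])
print("probe accuracy:", probe.score(hidden[800:], target[800:]))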

Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes

The D-NTM is evaluated on a set of Facebook bAbI tasks and shown to outperform NTM and LSTM baselines and provide further experimental results on sequential pMNIST, Stanford Natural Language Inference, associative recall and copy tasks.
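The two addressing schemes named in the title can be contrasted with a small sketch: a soft read blends all memory rows by attention weight, while a hard read commits to a single row (argmax below; the paper trains discrete addressing with REINFORCE-style sampling). Function and variable names are illustrative, not the D-NTM implementation.

# Soft addressing returns a smooth mixture; hard addressing returns one row.
import numpy as np

def address(memory, query, hard=False):
    logits = memory @ query
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                # attention over memory rows
    if hard:
        return memory[np.argmax(weights)]   # one row, non-differentiable
    return weights @ memory                 # differentiable soft read

memory = np.random.default_rng(2).normal(size=(8, 16))
query = memory[3]
print(np.allclose(address(memory, query, hard=True), memory[3]))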

Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences

A hierarchical model for sequential data is proposed that learns a tree on the fly, i.e. while reading the sequence, creating adaptive skip-connections that ease the learning of long-term dependencies.

Understanding Recurrent Neural State Using Memory Signatures

A network visualization technique is presented to analyze the recurrent state inside the LSTMs/GRUs commonly used in language and acoustic models, and to extract knowledge of the history encoded in the layers of grapheme-based end-to-end ASR networks.

Understanding Neural Networks through Representation Erasure

This paper proposes a general methodology to analyze and interpret decisions from a neural model by observing the effects on the model of erasing various parts of the representation, such as input word-vector dimensions, intermediate hidden units, or input words.
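The erasure methodology is easy to illustrate: remove one part of the representation at a time and record how much the model's output changes. A minimal sketch, where `score` is a toy stand-in for any trained model rather than the paper's setup:

# Erase one input word vector at a time; a large score change marks
# that word as important to the (toy) model's decision.
import numpy as np

def score(word_vectors):
    return float(np.tanh(word_vectors.sum(axis=0)).mean())  # toy model

sentence = np.random.default_rng(3).normal(size=(6, 50))    # 6 word vectors
base = score(sentence)
for i in range(len(sentence)):
    erased = sentence.copy()
    erased[i] = 0.0                                         # erase word i
    print(f"word {i}: importance {base - score(erased):+.4f}")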

Neural Semantic Encoders

This paper demonstrates the effectiveness and flexibility of NSE on five different natural language tasks: natural language inference, question answering, sentence classification, document sentiment analysis and machine translation, where NSE achieved state-of-the-art performance on publicly available benchmarks.