Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure

@article{Hupkes2018VisualisationA,
  title={Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure},
  author={Dieuwke Hupkes and Willem H. Zuidema},
  journal={J. Artif. Intell. Res.},
  year={2018},
  volume={61},
  pages={907-926}
}
In this paper, we investigate how recurrent neural networks can learn and process languages with hierarchical, compositional semantics. To this end, we define the artificial task of processing nested arithmetic expressions, and study whether different types of neural networks can learn to compute their meaning. We find that simple recurrent networks cannot find a generalising solution to this task, but gated recurrent neural networks perform surprisingly well: networks learn to predict the… 
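As a rough illustration of the task described above (not the authors' own data-generation code), a minimal Python sketch that builds nested arithmetic expressions and computes their values could look like this; the depth, operators, and operand range are illustrative choices:

```python
import random

def gen_expr(depth):
    """Recursively build a bracketed arithmetic expression and its value.

    Returns a (string, value) pair, e.g. ("( 3 + ( 5 - 2 ) )", 6).
    """
    if depth == 0:
        n = random.randint(-10, 10)
        return str(n), n
    left_str, left_val = gen_expr(depth - 1)
    right_str, right_val = gen_expr(depth - 1)
    if random.random() < 0.5:
        return f"( {left_str} + {right_str} )", left_val + right_val
    return f"( {left_str} - {right_str} )", left_val - right_val

random.seed(0)
for _ in range(3):
    expression, value = gen_expr(depth=2)
    print(expression, "=", value)
```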

Diagnostic classification and symbolic guidance to understand and improve recurrent neural networks

TLDR
A variety of methods for inspecting and understanding the internal dynamics of gated recurrent neural networks is explored on a task that targets a key feature of language, the hierarchical compositionality of meaning, yielding a detailed understanding of the computations the networks implement to execute their task.

Discovering the Compositional Structure of Vector Representations with Role Learning Networks

TLDR
A novel analysis technique called ROLE is used to show that recurrent neural networks perform well on compositional tasks by converging to solutions which implicitly represent symbolic structure, and uncovers a symbolic structure which closely approximates the encodings of a standard seq2seq network trained to perform the compositional SCAN task.
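The "implicit symbolic structure" referred to here is usually formalised as a tensor product representation: each symbol (filler) is bound to a structural role by an outer product, and the bindings are summed. The numpy sketch below only illustrates that encoding scheme with made-up vectors; it is not the ROLE model itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative filler (symbol) and role (position) embeddings.
fillers = {w: rng.normal(size=8) for w in ["jump", "walk", "twice"]}
roles = [rng.normal(size=4) for _ in range(3)]   # one role vector per position

def tensor_product_encoding(sequence):
    """Encode a sequence as a sum of filler-role outer products."""
    encoding = np.zeros((8, 4))
    for position, symbol in enumerate(sequence):
        encoding += np.outer(fillers[symbol], roles[position])
    return encoding

enc = tensor_product_encoding(["jump", "twice"])
print(enc.shape)  # (8, 4); with independent role vectors, fillers can be read back out
```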

Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?

TLDR
A method is introduced for evaluating whether neural models can learn the systematicity of monotonicity inference in natural language, i.e., the regularity of performing arbitrary inferences with generalization over composition.

Understanding Memory Modules on Learning Simple Algorithms

TLDR
This work applies a two-step analysis pipeline: first inferring a hypothesis about what strategy the model has learned from visualizations, and then verifying it with a newly proposed qualitative analysis method based on dimension reduction.

Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information

TLDR
It is shown that ‘diagnostic classifiers’, trained to predict number from the internal states of a language model, provide a detailed understanding of how, when, and where this information is represented, and that this knowledge can be used to improve the language model's performance.
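In its simplest form, such a diagnostic classifier is a linear probe trained to predict a hypothesised feature (here, grammatical number) from frozen hidden states. A hedged scikit-learn sketch, with random vectors standing in for states collected from a real language model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-ins for hidden states collected from a trained LM: one vector per
# subject token, labelled singular (0) or plural (1). Real probes would use
# the actual recorded states; the dimensions here are arbitrary.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(2000, 650))
number_labels = rng.integers(0, 2, size=2000)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, number_labels, test_size=0.2, random_state=0)

# The diagnostic classifier: kept deliberately simple (linear), so any
# decodable signal reflects information in the states, not probe capacity.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```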

Recognizing and Verifying Mathematical Equations Using Multiplicative Differential Neural Units

TLDR
The effectiveness of the proposed higher-order, memory-augmented recursive-NN models is demonstrated on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence.

Analysing Representations of Memory Impairment in a Clinical Notes Classification Model

TLDR
This method reveals the types of sentences that lead the model to make incorrect diagnoses and identifies clusters of sentences in the embedding space that correlate strongly with importance scores for each clinical diagnosis class.

Can RNNs learn Recursive Nested Subject-Verb Agreements?

TLDR
A new framework to study recursive processing in RNNs is presented, using subject-verb agreement as a probe into the representations of the neural network, which indicates how neural networks may extract bounded nested tree structures, without learning a systematic recursive rule.

Information-Theoretic Probing Explains Reliance on Spurious Features

TLDR
The hypothesis is tested that the extent to which a feature influences a model's decisions can be predicted using a combination of two factors: the feature's extractability after pre-training, and the evidence available during fine-tuning.

Colorless Green Recurrent Networks Dream Hierarchically

TLDR
The authors' language-model-trained RNNs make reliable predictions about long-distance agreement and do not lag much behind human performance, supporting the hypothesis that RNNs are not just shallow-pattern extractors but also acquire deeper grammatical competence.
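The standard evaluation behind such claims compares the probability the trained language model assigns to the grammatically correct and incorrect verb forms in the same context; the model is credited when the correct form scores higher. A sketch of that comparison with a placeholder scoring function (verb_logprob is a stub, not a real LM interface):

```python
def verb_logprob(context, verb):
    """Placeholder for log P(verb | context) from a trained language model.

    Any LM interface could be plugged in here; the hard-coded scores only
    make the comparison below runnable.
    """
    toy_scores = {"are": -2.0, "is": -3.5}
    return toy_scores[verb]

# Long-distance agreement item: the attractor "cabinet" is singular,
# but the head noun "keys" requires a plural verb.
context = "The keys to the cabinet"
correct_form, wrong_form = "are", "is"

model_is_right = verb_logprob(context, correct_form) > verb_logprob(context, wrong_form)
print("agreement correct" if model_is_right else "agreement error")
```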
...

References

Diagnostic classification and symbolic guidance to understand and improve recurrent neural networks

TLDR
A variety of methods for inspecting and understanding the internal dynamics of gated recurrent neural networks is explored on a task that targets a key feature of language, the hierarchical compositionality of meaning, yielding a detailed understanding of the computations the networks implement to execute their task.

Recursive Neural Networks Can Learn Logical Semantics

TLDR
This work generates artificial data from a logical grammar and uses it to evaluate the models' ability to learn to handle basic relational reasoning, recursive structures, and quantification, suggesting that they can learn suitable representations for logical inference in natural language.

Visualisation and ‘diagnostic classifiers’ reveal how recurrent and recursive neural networks process hierarchical structure

We investigate how neural networks can learn and process languages with hierarchical, compositional semantics. To this end, we define the artificial task of processing nested arithmetic expressions,...

Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks

TLDR
This work presents LSTMVis, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics, and shows several use cases of the tool for analyzing specific hidden state properties on datasets containing nesting, phrase structure, and chord progressions.

Visualizing and Understanding Neural Models in NLP

TLDR
Four strategies for visualizing compositionality in neural models for NLP, inspired by similar work in computer vision, are described, including LSTM-style gates that measure information flow and saliency derived from gradient back-propagation.
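One such strategy, gradient-based saliency, can be sketched in a few lines of PyTorch: the norm of the gradient of the model's output with respect to each input embedding indicates how much that word contributed. The model below is untrained and purely illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

embed = nn.Embedding(100, 16)             # toy vocabulary and embedding size
lstm = nn.LSTM(16, 32, batch_first=True)
classifier = nn.Linear(32, 2)             # e.g. two sentiment classes

tokens = torch.tensor([[5, 17, 42, 8]])   # one hypothetical sentence
emb = embed(tokens).detach().requires_grad_(True)

_, (h_n, _) = lstm(emb)
score = classifier(h_n[-1])[0, 1]         # score of the "positive" class
score.backward()

# Saliency of each word = norm of the gradient of the score w.r.t. its embedding.
saliency = emb.grad.norm(dim=-1).squeeze(0)
print(saliency)
```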

Visualizing and Understanding Recurrent Networks

TLDR
This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.
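Mechanically, this kind of inspection amounts to running a sequence through the network and reading off one hidden unit per time step. The PyTorch sketch below uses untrained toy weights, so the trace is meaningless; with a trained character-level model, some units track properties such as bracket depth:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
text = "f(a, g(b, (c)))"                     # character sequence with nesting
vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}

embed = nn.Embedding(len(vocab), 8)
lstm = nn.LSTM(8, 16, batch_first=True)      # untrained, illustrative only

ids = torch.tensor([[vocab[ch] for ch in text]])
outputs, _ = lstm(embed(ids))                # hidden states: (1, len(text), 16)

unit = 3                                     # inspect a single hidden unit over time
for ch, h in zip(text, outputs[0, :, unit].tolist()):
    print(f"{ch!r}: {h:+.3f}")
```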

Learning task-dependent distributed representations by backpropagation through structure

  C. Goller and A. Küchler, Proceedings of International Conference on Neural Networks (ICNN'96), 1996
TLDR
A connectionist architecture together with a novel supervised learning scheme which is capable of solving inductive inference tasks on complex symbolic structures of arbitrary size is presented.
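The core idea, later popularised as recursive neural networks, is that every node in a symbolic structure receives a vector computed from its children by one shared composition function, with errors backpropagated along the same tree. A forward-pass sketch with toy dimensions and a tanh composition (not the paper's exact architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))   # shared composition weights
b = np.zeros(DIM)

def compose(left, right):
    """One shared composition step, applied at every internal tree node."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def encode(tree, leaf_vectors):
    """Recursively encode a binary tree, given vectors for its leaves."""
    if isinstance(tree, str):                    # leaf: look up its vector
        return leaf_vectors[tree]
    left, right = tree
    return compose(encode(left, leaf_vectors), encode(right, leaf_vectors))

leaves = {w: rng.normal(size=DIM) for w in ["the", "cat", "sleeps"]}
root = encode((("the", "cat"), "sleeps"), leaves)
print(root.shape)  # (8,)
```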

The Forest Convolutional Network: Compositional Distributional Semantics with a Neural Chart and without Binarization

TLDR
A new model, the Forest Convolutional Network, is introduced that avoids all of the challenges of current recursive neural network approaches for computing sentence meaning, by taking a parse forest as input, rather than a single tree, and by allowing arbitrary branching factors.

A Convolutional Neural Network for Modelling Sentences

TLDR
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations.
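The DCNN's central operation is (dynamic) k-max pooling: for each feature dimension, keep the k largest activations along the sentence while preserving their original order. A minimal numpy sketch of the static version:

```python
import numpy as np

def k_max_pool(features, k):
    """Per feature row, keep the k largest values along the sentence axis,
    in their original left-to-right order.

    features: (num_features, sentence_length) -> returns (num_features, k)
    """
    top_idx = np.argpartition(-features, k - 1, axis=1)[:, :k]  # k largest per row
    top_idx = np.sort(top_idx, axis=1)                          # restore word order
    return np.take_along_axis(features, top_idx, axis=1)

x = np.array([[0.1, 0.9, 0.3, 0.7, 0.2],
              [0.5, 0.2, 0.8, 0.1, 0.6]])
print(k_max_pool(x, k=3))
# [[0.9 0.3 0.7]
#  [0.5 0.8 0.6]]
```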

Simple Recurrent Networks Learn Context-Free and Context-Sensitive Languages by Counting

TLDR
A range of language tasks are shown in which an SRN develops solutions that not only count but also copy and store counting information, demonstrating how SRNs may be an alternative psychological model of language or sequence processing.
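The counting strategy described here can be written out explicitly: a single counter incremented on 'a' and decremented on 'b' decides membership in a^n b^n, which is the symbolic version of what the SRN's continuous units are argued to approximate. A minimal sketch:

```python
def is_anbn(s):
    """Recognise a^n b^n (n >= 1) with a single counter."""
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:              # an 'a' after the first 'b' is illegal
                return False
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:           # more b's than a's so far
                return False
        else:
            return False
    return seen_b and count == 0

for s in ["aabb", "aaabbb", "abab", "aab"]:
    print(s, is_anbn(s))
```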