Long Short-Term Memory-Networks for Machine Reading

Jianpeng Cheng, Li Dong, Mirella Lapata. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations… 
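The abstract describes replacing the LSTM's single memory cell with a memory tape of all previous states, which the reader attends over at each step. The following is a minimal sketch of that idea in NumPy; the single-head additive attention, the weight names, and the gate parameterisation are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMNCell:
    """Sketch of an LSTM whose recurrence attends over a memory tape of
    all previous hidden/cell states instead of using only the last one."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        d, h = input_size, hidden_size
        self.h = h
        # one weight matrix per gate, acting on [x_t ; h~_t]
        self.W = {g: rng.standard_normal((h, d + h)) * 0.1
                  for g in ("i", "f", "o", "c")}
        # additive attention parameters over the memory tape (assumed form)
        self.Wa = rng.standard_normal((h, h)) * 0.1
        self.Ua = rng.standard_normal((h, h)) * 0.1
        self.va = rng.standard_normal(h) * 0.1

    def step(self, x, h_tape, c_tape):
        """One step: score every tape entry against the previous state,
        build adaptive summaries, then apply ordinary LSTM gating."""
        h_prev = h_tape[-1]
        scores = np.array([self.va @ np.tanh(self.Wa @ h_k + self.Ua @ h_prev)
                           for h_k in h_tape])
        alpha = softmax(scores)
        # adaptive summaries replace the single previous hidden/cell state
        h_tilde = alpha @ np.stack(h_tape)
        c_tilde = alpha @ np.stack(c_tape)
        z = np.concatenate([x, h_tilde])
        i = sigmoid(self.W["i"] @ z)
        f = sigmoid(self.W["f"] @ z)
        o = sigmoid(self.W["o"] @ z)
        c_hat = np.tanh(self.W["c"] @ z)
        c = f * c_tilde + i * c_hat
        h = o * np.tanh(c)
        return h, c, alpha

def read(cell, xs):
    """Process a sequence left to right, growing the memory tapes."""
    h_tape = [np.zeros(cell.h)]
    c_tape = [np.zeros(cell.h)]
    attentions = []
    for x in xs:
        h, c, alpha = cell.step(x, h_tape, c_tape)
        h_tape.append(h)
        c_tape.append(c)
        attentions.append(alpha)
    return h_tape[1:], attentions
```

The attention weights `alpha` over earlier tokens are what the abstract refers to as weakly induced relations between input units: the tape grows with the sequence, so each new token can softly address any previous position rather than only the immediately preceding state.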


Neural Semantic Encoders

This paper demonstrated the effectiveness and flexibility of NSE on five different natural language tasks: natural language inference, question answering, sentence classification, document sentiment analysis, and machine translation, where NSE achieved state-of-the-art performance on publicly available benchmarks.

Attention-based Memory Selection Recurrent Network for Language Modeling

The Attention-based Memory Selection Recurrent Network (AMSRN) is proposed, in which the model can review the information stored in memory at each previous time step and select the relevant information to help generate the output.

A memory enhanced LSTM for modeling complex temporal dependencies

In this paper, we present Gamma-LSTM, an enhanced long short-term memory (LSTM) unit, to enable learning of hierarchical representations through multiple stages of temporal abstraction.

Self-Attentive Residual Decoder for Neural Machine Translation

A target-side-attentive residual recurrent network for decoding is proposed, in which attention over previous words contributes directly to the prediction of the next word and can emphasize any of the previously translated words, giving the decoder access to a wider context.

Deep Attentive Structured Language Model based on LSTM

A deep attentive structured language model (DAS LM) is proposed, which extends the Long Short-Term Memory (LSTM) neural network with an attention mechanism and with part-of-speech (POS) tags as an alternative input.

Memory-enhanced Decoder for Neural Machine Translation

The memory in this memory-enhanced RNN decoder is a matrix of pre-determined size, designed to better capture the information important for decoding at each time step; it yields the best performance achieved with the same training set.

Frustratingly Short Attention Spans in Neural Language Modeling

This paper proposes a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution; the model outperforms existing memory-augmented neural language models on two corpora.

Long short-term memory description and its application in text processing

  • L. Skovajsová
  • Computer Science
    2017 Communication and Information Technologies (KIT)
The paper surveys the state of the art in Long Short-Term Memory neural networks, focusing on context utilization in text documents; LSTM is widely applicable because it can remember preceding states and use them to evaluate the required task.

Learning to Search in Long Documents Using Document Structure

A novel framework for reading comprehension is proposed, which represents documents as trees and models an agent that learns to interleave quick navigation through the document tree with more expensive answer extraction.



Memory Networks

This work describes a new class of learning models called memory networks, which reason with inference components combined with a long-term memory component; they learn how to use these jointly.

Ask Me Anything: Dynamic Memory Networks for Natural Language Processing

The dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers, is introduced.

Recurrent Memory Networks for Language Modeling

Recurrent Memory Network (RMN) is proposed, a novel RNN architecture that not only amplifies the power of RNN but also facilitates the understanding of its internal functioning and allows us to discover underlying patterns in data.

End-To-End Memory Networks

A neural network with a recurrent attention model over a possibly large external memory that is trained end-to-end, and hence requires significantly less supervision during training, making it more generally applicable in realistic settings.

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

The Tree-LSTM is introduced, a generalization of LSTMs to tree-structured network topologies; it outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.

A Convolutional Neural Network for Modelling Sentences

A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations.

Dynamic Memory Networks for Visual and Textual Question Answering

The new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.

Depth-Gated LSTM

In this short note, we present an extension of long short-term memory (LSTM) neural networks that uses a depth gate to connect memory cells of adjacent layers. Doing so introduces a linear dependence between lower and upper recurrent units.

A Deep Memory-based Architecture for Sequence-to-Sequence Learning

A novel deep architecture for sequence-to-sequence learning, which performs the task through a series of nonlinear transformations from the representation of the input sequence to the final output sequence, and can achieve performance comparable to the traditional phrase-based machine translation system Moses with a small vocabulary and a modest parameter size.

Depth-Gated Recurrent Neural Networks

In this short note, we present an extension of LSTM to use a depth gate to connect memory cells of adjacent layers. Doing so introduces a linear dependence between lower and upper recurrent units.