Long Short-Term Memory-Networks for Machine Reading

@inproceedings{Cheng2016LongSM,
  title={Long Short-Term Memory-Networks for Machine Reading},
  author={Jianpeng Cheng and Li Dong and Mirella Lapata},
  booktitle={EMNLP},
  year={2016}
}
In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations …
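The mechanism described above (tapes of all previous hidden and cell states, read with intra-attention at every step, in place of the LSTM's single memory cell) can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation; the class name, the additive scoring function, and the use of the previous output as part of the attention query are simplifying assumptions.

# Minimal sketch of an LSTMN-style step (assumed names/shapes): the single LSTM
# memory cell is replaced by hidden/cell "tapes" that grow over time and are
# read with intra-attention before the usual gating is applied.
import torch
import torch.nn as nn

class LSTMNCellSketch(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # additive attention over the hidden tape: v^T tanh(W [x_t; h_i; h_prev])
        self.attn_w = nn.Linear(input_size + 2 * hidden_size, hidden_size)
        self.attn_v = nn.Linear(hidden_size, 1, bias=False)
        # standard LSTM gates computed from x_t and the attention summary
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x_t, h_tape, c_tape, h_prev):
        # x_t: (input_size,); h_tape, c_tape: (t, hidden_size); h_prev: (hidden_size,)
        t = h_tape.size(0)
        feats = torch.cat([x_t.expand(t, -1), h_tape, h_prev.expand(t, -1)], dim=-1)
        alpha = torch.softmax(self.attn_v(torch.tanh(self.attn_w(feats))).squeeze(-1), dim=0)
        h_tilde = (alpha.unsqueeze(-1) * h_tape).sum(dim=0)   # adaptive hidden summary
        c_tilde = (alpha.unsqueeze(-1) * c_tape).sum(dim=0)   # adaptive cell summary
        i, f, o, g = self.gates(torch.cat([x_t, h_tilde], dim=-1)).chunk(4, dim=-1)
        c_t = torch.sigmoid(f) * c_tilde + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        return h_t, c_t   # both are appended to the tapes for the next step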
Neural Semantic Encoders
TLDR: Demonstrates the effectiveness and flexibility of NSE on five different natural language tasks (natural language inference, question answering, sentence classification, document sentiment analysis, and machine translation), achieving state-of-the-art performance on publicly available benchmarks.
Attention-based Memory Selection Recurrent Network for Language Modeling
TLDR: Proposes the Attention-based Memory Selection Recurrent Network (AMSRN), in which the model can review the information stored in memory at each previous time step and select the relevant information to help generate the output.
A memory enhanced LSTM for modeling complex temporal dependencies
In this paper, we present Gamma-LSTM, an enhanced long short-term memory (LSTM) unit, to enable learning of hierarchical representations through multiple stages of temporal abstraction. Gamma …
Self-Attentive Residual Decoder for Neural Machine Translation
TLDR: Proposes a target-side-attentive residual recurrent network for decoding, in which attention over previous words contributes directly to the prediction of the next word and can emphasize any of the previously translated words, giving the decoder access to a wider context.
Deep Attentive Structured Language Model Based on LSTM
The language model (LM) plays an essential role in natural language processing tasks: given the context, it predicts the next word. However, when the history becomes longer, the single …
TLDR: Proposes a deep attentive structured language model (DAS LM), which extends the Long Short-Term Memory (LSTM) neural network with an attention mechanism and the alternative input of part-of-speech (POS) tags.
Phrase-level Self-Attention Networks for Universal Sentence Encoding
TLDR: Proposes Phrase-level Self-Attention Networks (PSAN), which perform self-attention across the words inside a phrase to capture context dependencies at the phrase level, and use a gated memory-updating mechanism to refine each word's representation hierarchically with the longer-term context dependencies captured in larger phrases.
Memory-enhanced Decoder for Neural Machine Translation
TLDR: The memory in this memory-enhanced RNN decoder is a matrix of pre-determined size, designed to better capture the information important for the decoding process at each time step; the model yields the best performance achieved with the same training set.
Frustratingly Short Attention Spans in Neural Language Modeling
TLDR: Proposes a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution, and that outperforms existing memory-augmented neural language models on two corpora.
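The key-value separation this summary describes can be sketched roughly as follows. This is not the paper's code; the three-way split of each hidden state and the scaled dot-product scoring are illustrative assumptions.

# Rough sketch of key-value(-predict) attention over previous hidden states,
# assuming each state is split into three equal parts: key, value, and a part
# reserved for predicting the next word. Shapes and scoring are assumptions.
import torch

def key_value_predict_attention(hiddens: torch.Tensor, current: torch.Tensor):
    # hiddens: (t, 3*d) past states; current: (3*d,) state at the current step
    d = hiddens.size(-1) // 3
    keys, values, _ = hiddens.split(d, dim=-1)
    query = current[:d]                         # the key part of the current state
    alpha = torch.softmax(keys @ query / d ** 0.5, dim=0)
    context = alpha @ values                    # attention-weighted read of past values
    predict = current[2 * d:]                   # kept separate for the softmax layer
    return torch.cat([context, predict], dim=-1)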
Long short-term memory description and its application in text processing
  • L. Skovajsová
  • Computer Science
  • 2017 Communication and Information Technologies (KIT)
  • 2017
TLDR: Describes the state of the art of Long Short-Term Memory neural networks, focusing on how they exploit context in text documents; LSTMs find wide application because they can remember preceding states and use them to carry out the required task.

References

Showing 1-10 of 94 references
Memory Networks
TLDR: This work describes a new class of learning models called memory networks, which reason with inference components combined with a long-term memory component; they learn how to use these jointly.
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
TLDR: Introduces the dynamic memory network (DMN), a neural network architecture that processes input sequences and questions, forms episodic memories, and generates relevant answers.
Recurrent Memory Networks for Language Modeling
TLDR: Proposes the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of the RNN but also facilitates understanding of its internal functioning and allows underlying patterns in the data to be discovered.
End-To-End Memory Networks
TLDR: Presents a neural network with a recurrent attention model over a possibly large external memory; the model is trained end-to-end and hence requires significantly less supervision during training, making it more generally applicable in realistic settings.
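The attention-over-memory step this summary refers to can be sketched as a single "hop"; the function name and the assumption that memory slots are already embedded are illustrative, not the paper's exact formulation.

# Minimal single "hop" of soft attention over an external memory, in the spirit
# of end-to-end memory networks. Real models embed memories with learned input
# and output embedding matrices; here the embedded slots are taken as given.
import torch

def memory_hop(query, mem_in, mem_out):
    # query: (d,); mem_in, mem_out: (n, d) input/output embeddings of n memory slots
    p = torch.softmax(mem_in @ query, dim=0)   # soft, differentiable addressing
    o = p @ mem_out                            # weighted read from memory
    return query + o                           # updated query, fed to the next hop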
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
TLDR: Introduces the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies, which outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.
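A child-sum variant of the tree-structured update can be sketched as below; this is a minimal illustration under assumed names and shapes, not the authors' implementation.

# Minimal sketch of a child-sum Tree-LSTM node update (assumed names/shapes):
# the children's hidden states are summed for the input/output/update gates,
# while each child's cell state receives its own forget gate.
import torch
import torch.nn as nn

class ChildSumTreeLSTMSketch(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.iou = nn.Linear(input_size + hidden_size, 3 * hidden_size)
        self.f_x = nn.Linear(input_size, hidden_size)
        self.f_h = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x, child_h, child_c):
        # x: (input_size,); child_h, child_c: (num_children, hidden_size)
        h_sum = child_h.sum(dim=0)
        i, o, u = self.iou(torch.cat([x, h_sum], dim=-1)).chunk(3, dim=-1)
        f = torch.sigmoid(self.f_x(x) + self.f_h(child_h))   # one forget gate per child
        c = torch.sigmoid(i) * torch.tanh(u) + (f * child_c).sum(dim=0)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c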
A Convolutional Neural Network for Modelling Sentences
TLDR: Describes a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) for the semantic modelling of sentences, which induces a feature graph over the sentence capable of explicitly capturing short- and long-range relations.
Dynamic Memory Networks for Visual and Textual Question Answering
TLDR: The new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting-fact supervision.
Depth-Gated LSTM
In this short note, we present an extension of long short-term memory (LSTM) neural networks that uses a depth gate to connect memory cells of adjacent layers. Doing so introduces a linear dependence …
A Deep Memory-based Architecture for Sequence-to-Sequence Learning
TLDR: Proposes a novel deep memory-based architecture for sequence-to-sequence learning that performs the task through a series of nonlinear transformations from the representation of the input sequence to the final output sequence, and can achieve performance comparable to the traditional phrase-based machine translation system Moses with a small vocabulary and a modest parameter size.