• Corpus ID: 12305768

Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention

Yang Liu, Chengjie Sun, Lei Lin, Xiaolong Wang
In this paper, we proposed a sentence encoding-based model for recognizing textual entailment. Firstly, average pooling was used over a word-level bidirectional LSTM (biLSTM) to generate a first-stage sentence representation. Secondly, an attention mechanism was employed to replace average pooling on the same sentence for better representations.
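The two-stage encoding described above can be illustrated with a minimal sketch. It assumes the per-word biLSTM outputs are already computed, and it uses a plain dot product against the mean-pooled vector as a stand-in for the learned attention scoring layer. The names `mean_pool`, `inner_attention`, and `dot_score` are hypothetical and are not taken from the paper.

```python
import math


def mean_pool(states):
    """Stage 1: average the per-word hidden states into one sentence vector."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def inner_attention(states, score_fn):
    """Stage 2: re-weight the same hidden states, guided by the stage-1 vector.

    states   : list of per-word biLSTM output vectors (assumed precomputed)
    score_fn : hypothetical scoring function (state, first_stage) -> float
    """
    first_stage = mean_pool(states)
    weights = softmax([score_fn(h, first_stage) for h in states])
    dim = len(states[0])
    second_stage = [
        sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)
    ]
    return second_stage, weights


def dot_score(state, query):
    # Assumption: a dot product replaces the learned scoring layer.
    return sum(a * b for a, b in zip(state, query))


# Toy hidden states for a three-word sentence (illustrative values only).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_vec, weights = inner_attention(states, dot_score)
```

With a constant scoring function the attention weights become uniform, so the second stage reduces to the first-stage average; a learned scorer is what would let the model emphasize some words over others.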


An Improved Mechanism for Universal Sentence Representations Learnt from Natural Language Inference Data Using Bi-directional Information
  • Dian Jiao, Sheng Gao, Baodong Zhang
  • Computer Science
    Proceedings of the 2019 International Conference on Computer, Network, Communication and Information Systems (CNCI 2019)
  • 2019
An improved pooling mechanism based on max pooling for universal sentence encoder is proposed, which uses three kinds of methods to refine the backward and forward information at each time step and then uses a max-pooling layer or attention mechanism to obtain a fixed-size sentence representation from variable-length refined hidden states.
Natural language inference using LSTM model with sentence fusion
The results demonstrate that the LSTM with Sentence Fusion, which reads the premise and hypothesis to produce a final fusion representation from which a three-way classifier predicts the label, performs better than LSTM RNN encoders and a lexicalized classifier.
Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference
This paper describes a model (alpha) that is ranked among the top in the Shared Task, on both the in-domain test set and the cross-domain test set, demonstrating that the model generalizes well to the cross-domain data.
Double Attention Mechanism for Sentence Embedding
Experimental results show that the proposed model yields a significant performance gain compared to other sentence embedding methods on all three datasets, and the model can be trained end-to-end with limited hyper-parameters.
A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment
A sentence encoding model that exploits the sentence-to-sentence relation information for RTE and combines the strength of RNN and CNN to present a unified model for the RTE task.
Phrase-level Self-Attention Networks for Universal Sentence Encoding
Phrase-level Self-Attention Networks (PSAN) that perform self-attention across words inside a phrase to capture context dependencies at the phrase level, and use the gated memory updating mechanism to refine each word’s representation hierarchically with longer-term context dependencies captured in a larger phrase are proposed.
Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks
This paper shows that joint learning of multiple tasks results in better generalizable sentence representations by conducting extensive experiments and analysis comparing the multi-task and single-task learned sentence encoders.
CGSPN: cascading gated self-attention and phrase-attention network for sentence modeling
A Cascading Gated Self-attention and Phrase-attention Network (CGSPN) is proposed that generates the sentence embedding by considering contextual words and key phrases in a sentence and by abstracting the semantics of phrases.
Enhancing Sentence Embedding with Generalized Pooling
A vector-based multi-head attention that includes the widely used max pooling, mean pooling, and scalar self-attention as special cases is proposed; it achieves significant improvement over strong sentence-encoding-based methods.
Attention-Fused Deep Matching Network for Natural Language Inference
An attention-fused deep matching network (AF-DMN) for natural language inference that takes two sentences as input and iteratively learns the attention-aware representations for each side by multi-level interactions and adds a self-attention mechanism to fully exploit local context information within each sentence.


LSTM-based Deep Learning Models for non-factoid answer selection
A general deep learning framework is applied for the answer selection task, which does not depend on manually defined features or linguistic tools, and is extended in two directions to define a more composite representation for questions and answers.
Reasoning about Entailment with Neural Attention
This paper proposes a neural model that reads two sentences to determine entailment using long short-term memory units and extends this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases, and presents a qualitative analysis of attention weights produced by this model.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Natural Language Inference by Tree-Based Convolution and Heuristic Matching
This model, a tree-based convolutional neural network (TBCNN), captures sentence-level semantics; then heuristic matching layers like concatenation and element-wise product/difference combine the information in individual sentences.
A large annotated corpus for learning natural language inference
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Recognizing Entailment and Contradiction by Tree-based Convolution
Experimental results on a large dataset verify the rationale of using TBCNN as the sentence-level model; leveraging additional heuristics like element-wise product/difference further improves the accuracy.
A Fast Unified Model for Parsing and Sentence Understanding
The Stack-augmented Parser-Interpreter Neural Network (SPINN) combines parsing and interpretation within a single tree-sequence hybrid model by integrating tree-structured sentence interpretation into the linear sequential structure of a shift-reduce parser.
Learning to Execute
This work developed a new variant of curriculum learning that improved the networks' performance in all experimental conditions and had a dramatic impact on an addition problem, enabling an LSTM to add two 9-digit numbers with 99% accuracy.
GloVe: Global Vectors for Word Representation
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Recognising Textual Entailment with Logical Inference
This work incorporates model building, a technique borrowed from automated reasoning, and shows that it is a useful robust method to approximate entailment, and uses machine learning to combine these deep semantic analysis techniques with simple shallow word overlap.