Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention
@article{Liu2016LearningNL, title={Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention}, author={Yang Liu and Chengjie Sun and Lei Lin and Xiaolong Wang}, journal={ArXiv}, year={2016}, volume={abs/1605.09090} }
In this paper, we propose a sentence encoding-based model for recognizing text entailment. Key Method: First, average pooling is applied over the hidden states of a word-level bidirectional LSTM (biLSTM) to generate a first-stage sentence representation. Second, an attention mechanism replaces average pooling over the same sentence to produce a better representation. A sketch of this two-stage scheme follows.
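The two-stage encoder can be sketched in PyTorch as below. This is a minimal illustration of the idea only, not the authors' exact formulation: the layer sizes, the additive scoring function, and all parameter names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InnerAttentionEncoder(nn.Module):
    """Sketch: biLSTM sentence encoder with inner-attention (assumed details)."""

    def __init__(self, embed_dim=300, hidden_dim=300):
        super().__init__()
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Projections used to score each time step against the pooled summary
        # (an assumed additive attention form, not necessarily the paper's).
        self.w_h = nn.Linear(2 * hidden_dim, 2 * hidden_dim, bias=False)
        self.w_p = nn.Linear(2 * hidden_dim, 2 * hidden_dim, bias=False)
        self.v = nn.Linear(2 * hidden_dim, 1, bias=False)

    def forward(self, embedded):
        # embedded: (batch, seq_len, embed_dim) word embeddings
        hidden, _ = self.bilstm(embedded)      # (batch, seq_len, 2 * hidden_dim)

        # Stage 1: average pooling over time -> first-stage representation.
        pooled = hidden.mean(dim=1)            # (batch, 2 * hidden_dim)

        # Stage 2: attend over the same hidden states, guided by the
        # first-stage representation, and replace average pooling.
        scores = self.v(torch.tanh(self.w_h(hidden) +
                                   self.w_p(pooled).unsqueeze(1)))
        alpha = F.softmax(scores, dim=1)       # (batch, seq_len, 1)
        return (alpha * hidden).sum(dim=1)     # attended sentence representation
```

In this reading, the pooled vector acts as a query over the sentence's own hidden states, so the final representation re-weights the words the first pass found salient.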
235 Citations
An Improved Mechanism for Universal Sentence Representations Learnt from Natural Language Inference Data Using Bi-directional Information
- Computer ScienceProceedings of the 2019 International Conference on Computer, Network, Communication and Information Systems (CNCI 2019)
- 2019
An improved pooling mechanism based on max pooling is proposed for universal sentence encoders: three methods refine the backward and forward information at each time step, and a max-pooling layer or attention mechanism then obtains a fixed-size sentence representation from the variable-length refined hidden states.
Natural language inference using LSTM model with sentence fusion
- Computer Science2017 36th Chinese Control Conference (CCC)
- 2017
The results demonstrate that the LSTM with sentence fusion, which reads the premise and hypothesis to produce a final fused representation from which a three-way classifier predicts the label, outperforms LSTM RNN encoders and a lexicalized classifier.
Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference
- Computer ScienceRepEval@EMNLP
- 2017
This paper describes a model (alpha) that is ranked among the top in the shared task on both the in-domain and the cross-domain test sets, demonstrating that the model generalizes well to cross-domain data.
Double Attention Mechanism for Sentence Embedding
- Computer ScienceWISA
- 2018
Experimental results show that the proposed model yields a significant performance gain over other sentence embedding methods on all three datasets, and the model can be trained end-to-end with few hyper-parameters.
A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment
- Computer Science
- 2016
A sentence encoding model that exploits sentence-to-sentence relation information for RTE, combining the strengths of RNNs and CNNs into a unified model for the task.
Phrase-level Self-Attention Networks for Universal Sentence Encoding
- Computer ScienceEMNLP
- 2018
Proposes Phrase-level Self-Attention Networks (PSAN) that perform self-attention across the words inside a phrase to capture context dependencies at the phrase level, and use a gated memory-updating mechanism to hierarchically refine each word's representation with longer-term dependencies captured in larger phrases.
Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks
- Computer Science
- 2018
This paper shows, through extensive experiments and analysis comparing multi-task and single-task learned sentence encoders, that joint learning of multiple tasks yields more generalizable sentence representations.
CGSPN: Cascading gated self-attention and phrase-attention network for sentence modeling
- Computer ScienceJournal of Intelligent Information Systems
- 2020
A Cascading Gated Self-attention and Phrase-attention Network (CGSPN) is proposed that generates the sentence embedding by considering contextual words and key phrases in a sentence, abstracting the semantics of phrases.
Enhancing Sentence Embedding with Generalized Pooling
- Computer ScienceCOLING
- 2018
A vector-based multi-head attention that includes the widely used max pooling, mean pooling, and scalar self-attention as special cases is proposed, achieving significant improvements over strong sentence-encoding-based methods.
Attention-Fused Deep Matching Network for Natural Language Inference
- Computer ScienceIJCAI
- 2018
An attention-fused deep matching network (AF-DMN) for natural language inference that takes two sentences as input, iteratively learns attention-aware representations for each side through multi-level interactions, and adds a self-attention mechanism to fully exploit local context information within each sentence.
References
LSTM-based Deep Learning Models for non-factoid answer selection
- Computer ScienceArXiv
- 2015
A general deep learning framework is applied for the answer selection task, which does not depend on manually defined features or linguistic tools, and is extended in two directions to define a more composite representation for questions and answers.
Reasoning about Entailment with Neural Attention
- Computer ScienceICLR
- 2016
This paper proposes a neural model that reads two sentences to determine entailment using long short-term memory units and extends this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases, and presents a qualitative analysis of attention weights produced by this model.
Sequence to Sequence Learning with Neural Networks
- Computer ScienceNIPS
- 2014
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Natural Language Inference by Tree-Based Convolution and Heuristic Matching
- Computer ScienceACL
- 2016
This model, a tree-based convolutional neural network (TBCNN), captures sentence-level semantics; heuristic matching layers such as concatenation and element-wise product/difference then combine the information from the individual sentences.
A large annotated corpus for learning natural language inference
- Computer ScienceEMNLP
- 2015
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Recognizing Entailment and Contradiction by Tree-based Convolution
- Computer ScienceArXiv
- 2015
Experimental results on a large dataset verify the rationale of using TBCNN as the sentence-level model; leveraging additional heuristics like element-wise product/difference further improves the accuracy.
A Fast Unified Model for Parsing and Sentence Understanding
- Computer ScienceACL
- 2016
The Stack-augmented Parser-Interpreter Neural Network (SPINN) combines parsing and interpretation within a single tree-sequence hybrid model by integrating tree-structured sentence interpretation into the linear sequential structure of a shift-reduce parser.
Learning to Execute
- Computer ScienceArXiv
- 2014
This work developed a new variant of curriculum learning that improved the networks' performance in all experimental conditions and had a dramatic impact on an addition problem, enabling an LSTM to add two 9-digit numbers with 99% accuracy.
GloVe: Global Vectors for Word Representation
- Computer ScienceEMNLP
- 2014
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Recognising Textual Entailment with Logical Inference
- Computer ScienceHLT
- 2005
This work incorporates model building, a technique borrowed from automated reasoning, and shows that it is a useful, robust method for approximating entailment; machine learning is then used to combine these deep semantic analysis techniques with simple shallow word overlap.