Deep Fusion LSTMs for Text Semantic Matching
@inproceedings{Liu2016DeepFL,
  title     = {Deep Fusion LSTMs for Text Semantic Matching},
  author    = {Pengfei Liu and Xipeng Qiu and Jifan Chen and Xuanjing Huang},
  booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year      = {2016}
}
Recently, there has been rising interest in modelling the interactions of a text pair with deep neural networks. Specifically, DF-LSTMs consist of two interdependent LSTMs, each of which models a sequence under the influence of the other. We also use external memory to increase the capacity of LSTMs, thereby possibly capturing more complicated matching patterns. Experiments on two very large datasets demonstrate the efficacy of our proposed architecture. Furthermore, we present an elaborate qualitative…
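The core idea, two LSTMs that condition on each other's states at every step, can be made concrete with a short sketch. The following is a minimal illustration in PyTorch, not the paper's implementation: all names (`CoupledLSTM`, `emb_dim`, `hidden_dim`) are our own, the external-memory component is omitted, and equal sequence lengths are assumed for brevity.

```python
# A minimal sketch of two interdependent LSTMs, assuming PyTorch.
# Illustrative only: the paper's external memory is omitted and
# sequence lengths are assumed equal (real inputs need padding/masks).
import torch
import torch.nn as nn

class CoupledLSTM(nn.Module):
    """Encodes a sentence pair with two LSTMs that read in parallel;
    at each step, each cell's input is its own token embedding
    concatenated with the *other* cell's previous hidden state."""

    def __init__(self, emb_dim: int, hidden_dim: int):
        super().__init__()
        self.cell_a = nn.LSTMCell(emb_dim + hidden_dim, hidden_dim)
        self.cell_b = nn.LSTMCell(emb_dim + hidden_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, xa: torch.Tensor, xb: torch.Tensor):
        # xa, xb: (T, batch, emb_dim) embeddings of the two sentences.
        batch = xa.size(1)
        ha = xa.new_zeros(batch, self.hidden_dim)
        ca, hb, cb = ha.clone(), ha.clone(), ha.clone()
        for t in range(xa.size(0)):
            # Both updates use the other sequence's state from step t-1.
            ha_new, ca = self.cell_a(torch.cat([xa[t], hb], -1), (ha, ca))
            hb, cb = self.cell_b(torch.cat([xb[t], ha], -1), (hb, cb))
            ha = ha_new
        return ha, hb  # final states; e.g. feed [ha; hb] to an MLP scorer
```

As a rough usage example, `CoupledLSTM(emb_dim=8, hidden_dim=16)` applied to two `(T, batch, 8)` tensors yields one final state per sentence, from which a matching score could be predicted; the paper's actual interaction through external memory is richer than this sketch.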
64 Citations
Modelling Interaction of Sentence Pair with Coupled-LSTMs
- EMNLP, 2016
This paper introduces two coupled ways to model the interdependencies of two LSTMs, capturing the local contextualized interactions of two sentences, and uses dynamic pooling to select the most informative features.
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference
- NAACL, 2018
A novel dependent reading bidirectional LSTM network (DR-BiLSTM) is proposed to efficiently model the relationship between a premise and a hypothesis during encoding and inference in the natural language inference (NLI) task.
Contextualized Non-local Neural Networks for Sequence Learning
- AAAI, 2019
Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labeling show that the proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.
Deep bi-directional interaction network for sentence matching
- Applied Intelligence, 2021
A Deep Bi-Directional Interaction Network (DBDIN) is proposed that captures semantic relatedness from two directions; each direction employs multiple attention-based interaction units, and a self-attention mechanism is introduced at the end to enhance global matching information with smaller model complexity.
Recurrent Neural Word Segmentation with Tag Inference
- NLPCC/ICCPOL, 2016
A Long Short-Term Memory (LSTM) based model is presented for Chinese Weibo word segmentation; it introduces a transition score matrix for jumping between tags of successive characters to infer the optimal tag path.
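To make the transition-matrix idea concrete: given per-character tag scores from the network and a learned matrix of tag-to-tag jump scores, the optimal tag path can be recovered by Viterbi-style dynamic programming. The sketch below is a generic illustration under those assumptions, not the paper's code; `emit`, `trans`, and the toy BMES run are hypothetical.

```python
# Hedged sketch of tag-path inference with a transition score matrix.
import numpy as np

def viterbi(emit: np.ndarray, trans: np.ndarray) -> list[int]:
    """emit[t, k]: network score for tag k at character t;
    trans[j, k]: score for jumping from tag j to tag k.
    Returns the highest-scoring tag path."""
    T, K = emit.shape
    score = emit[0].copy()              # best score ending in each tag
    back = np.zeros((T, K), dtype=int)  # backpointers
    for t in range(1, T):
        cand = score[:, None] + trans + emit[t]  # (K, K) candidate scores
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy run: 4 characters, 4 BMES-style tags, random scores.
rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(4, 4)), rng.normal(size=(4, 4))))
```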
Knowledge Enhanced Hybrid Neural Network for Text Matching
- AAAI, 2018
Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text.
Identifying High Quality Document-Summary Pairs through Text Matching
- Information, 2017
A novel deep learning method is proposed to identify high-quality document–summary pairs for building a large-scale pairs dataset, along with an improved LSTM-based model that removes the forget gate in the LSTM unit.
Hybrid Attention Based Neural Architecture for Text Semantics Similarity Measurement
- DASFAA, 2020
A neural architecture with a hybrid attention mechanism is devised to highlight the important signals at different granularities within a text, together with an inter-attention component that further considers the influence of one sentence on another when modeling finer-granularity interactions.
Swings and Roundabouts: Attention-Structure Interaction Effect in Deep Semantic Matching
- IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020
These findings not only provide intellectual foundations for the popular use of “linear LSTM + attention” architectures in NLP/QA research, but also have implications in other modalities and domains.
Neural Networks for Semantic Textual Similarity
- ICON, 2017
This paper constructs a number of models from simple to complex within a framework and reports the results, showing that a small number of LSTM stacks with an LSTM stack comparator produces the best results.
References
Showing 1–10 of 36 references
Learning Natural Language Inference with LSTM
- NAACL, 2016
A special long short-term memory (LSTM) architecture for NLI that remembers important mismatches that are critical for predicting the contradiction or the neutral relationship label and achieves an accuracy of 86.1%, outperforming the state of the art.
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
- Transactions of the Association for Computational Linguistics, 2016
This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart.
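One of the ABCNN attention schemes builds an attention matrix whose entry (i, j) scores how well unit i of one sentence matches unit j of the other, with the matrix then reweighting each sentence's features before convolution. A minimal sketch of just the matrix computation, assuming PyTorch and the inverse-Euclidean match function described in that paper; the names and random features are illustrative.

```python
# Sketch of an ABCNN-style attention matrix between two sentences.
import torch

def attention_matrix(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """f1: (len1, dim), f2: (len2, dim) feature maps of the two sentences.
    A[i, j] = 1 / (1 + ||f1[i] - f2[j]||) scores how well unit i of
    sentence 1 matches unit j of sentence 2."""
    dist = torch.cdist(f1.unsqueeze(0), f2.unsqueeze(0)).squeeze(0)
    return 1.0 / (1.0 + dist)

A = attention_matrix(torch.randn(5, 8), torch.randn(7, 8))
print(A.shape)  # torch.Size([5, 7]); rows/columns act as attention weights
```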
Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
- EMNLP, 2015
This work proposes a model for comparing sentences from a multiplicity of perspectives: it first models each sentence using a convolutional neural network that extracts features at multiple levels of granularity and uses multiple types of pooling.
A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations
- AAAI, 2016
This work presents a new deep architecture to match two sentences with multiple positional sentence representations generated by a bidirectional long short-term memory (Bi-LSTM).
Convolutional Neural Network for Paraphrase Identification
- NAACL, 2015
A new deep learning architecture, Bi-CNN-MI, is presented for paraphrase identification (PI), based on the insight that PI requires comparing two sentences on multiple levels of granularity; convolutional neural networks are used to model interaction features at each level.
Recurrent Memory Network for Language Modeling
- arXiv, 2016
Recurrent Memory Network (RMN) is proposed, a novel RNN architecture that not only amplifies the power of RNN but also facilitates the understanding of its internal functioning and allows us to discover underlying patterns in data.
Convolutional Neural Tensor Network Architecture for Community-Based Question Answering
- IJCAI, 2015
This paper proposes a convolutional neural tensor network architecture to encode the sentences in semantic space and model their interactions with a tensor layer, which outperforms the other methods on two matching tasks.
Sequence to Sequence Learning with Neural Networks
- NIPS, 2014
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Convolutional Neural Network Architectures for Matching Natural Language Sentences
- NIPS, 2014
Convolutional neural network models for matching two sentences are proposed by adapting the convolutional strategies used in vision and speech; the models nicely represent the hierarchical structures of sentences through their layer-by-layer composition and pooling.
Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents
- EMNLP, 2015
A multi-timescale long short-term memory (MT-LSTM) neural network is proposed to model long texts; it can handle very long documents as well as short sentences.