Deep Fusion LSTMs for Text Semantic Matching

@inproceedings{Liu2016DeepFL,
  title={Deep Fusion LSTMs for Text Semantic Matching},
  author={Pengfei Liu and Xipeng Qiu and Jifan Chen and Xuanjing Huang},
  booktitle={Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year={2016}
}
  • Pengfei Liu, Xipeng Qiu, Jifan Chen, Xuanjing Huang
  • Published 1 August 2016
  • Computer Science
  • Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recently, there has been rising interest in modelling the interactions of text pairs with deep neural networks. […] Specifically, DF-LSTMs consist of two interdependent LSTMs, each of which models a sequence under the influence of the other. We also use external memory to increase the capacity of LSTMs, thereby possibly capturing more complicated matching patterns. Experiments on two very large datasets demonstrate the efficacy of our proposed architecture. Furthermore, we present an elaborate qualitative…
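The coupling is straightforward to sketch in code. Below is a minimal, hypothetical PyTorch illustration of two interdependent LSTMs in which each cell reads its own token together with the other sequence's previous hidden state; it assumes equal-length inputs and omits the external memory component, so it approximates the idea rather than reproducing the paper's exact update equations.

import torch
import torch.nn as nn

class InterdependentLSTMs(nn.Module):
    # Two coupled LSTMCells: each step of one sequence is conditioned on
    # the other sequence's previous hidden state. Hypothetical simplification;
    # the external memory of DF-LSTMs is omitted here.
    def __init__(self, emb_dim, hidden_dim):
        super().__init__()
        # each cell reads [own token embedding ; other side's previous hidden state]
        self.cell_a = nn.LSTMCell(emb_dim + hidden_dim, hidden_dim)
        self.cell_b = nn.LSTMCell(emb_dim + hidden_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, xa, xb):
        # xa, xb: (seq_len, batch, emb_dim); equal lengths assumed for brevity
        batch = xa.size(1)
        ha = xa.new_zeros(batch, self.hidden_dim)
        ca, hb, cb = torch.zeros_like(ha), torch.zeros_like(ha), torch.zeros_like(ha)
        for t in range(xa.size(0)):
            # both updates read the *previous* states, keeping the coupling symmetric
            ha_next, ca_next = self.cell_a(torch.cat([xa[t], hb], dim=-1), (ha, ca))
            hb_next, cb_next = self.cell_b(torch.cat([xb[t], ha], dim=-1), (hb, cb))
            ha, ca, hb, cb = ha_next, ca_next, hb_next, cb_next
        return ha, hb  # each sequence summarised under the influence of the other

# toy usage: batch of 2 pairs, 7 tokens each, 50-d embeddings
enc = InterdependentLSTMs(emb_dim=50, hidden_dim=64)
ha, hb = enc(torch.randn(7, 2, 50), torch.randn(7, 2, 50))
score = torch.sigmoid((ha * hb).sum(dim=-1))  # crude per-pair matching score

In the full model, a read/write against an external memory at each step would give the cells additional capacity for complicated matching patterns; a plain recurrent state suffices here to show the interdependence.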

Citations

Modelling Interaction of Sentence Pair with Coupled-LSTMs
TLDR
This paper introduces two ways to couple two LSTMs and model their interdependence, capturing the local contextualized interactions of two sentences, and uses dynamic pooling to select the most informative features (a minimal sketch of such pooling follows this list).
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference
TLDR
A novel dependent reading bidirectional LSTM network (DR-BiLSTM) is proposed to efficiently model the relationship between a premise and a hypothesis during encoding and inference in the natural language inference (NLI) task.
Contextualized Non-local Neural Networks for Sequence Learning
TLDR
Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labeling show that the proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.
Deep bi-directional interaction network for sentence matching
TLDR
A Deep Bi-Directional Interaction Network (DBDIN) is proposed, which captures semantic relatedness from two directions, with each direction employing multiple attention-based interaction units, and finally introduces a self-attention mechanism to enhance global matching information at lower model complexity.
Recurrent Neural Word Segmentation with Tag Inference
TLDR
A Long Short-Term Memory (LSTM) based model is presented for the task of Chinese Weibo word segmentation, introducing a transition score matrix for jumping between the tags of successive characters so as to infer the optimal tag path.
Knowledge Enhanced Hybrid Neural Network for Text Matching
TLDR
Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text.
Identifying High Quality Document-Summary Pairs through Text Matching
TLDR
A novel deep learning method is proposed to identify high-quality document–summary pairs for building a large-scale pairs dataset, along with an improved LSTM-based model that removes the forget gate in the LSTM unit.
Hybrid Attention Based Neural Architecture for Text Semantics Similarity Measurement
TLDR
A neural architecture with a hybrid attention mechanism is devised to highlight the important signals at different granularities within a text, together with an inter-attention component that further considers the influence of one sentence on another when modeling finer-grained interactions.
Swings and Roundabouts: Attention-Structure Interaction Effect in Deep Semantic Matching
  • Amulya Gupta, Zhu Zhang
  • Computer Science
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2020
TLDR
These findings not only provide intellectual foundations for the popular use of “linear LSTM + attention” architectures in NLP/QA research, but also have implications in other modalities and domains.
Neural Networks for Semantic Textual Similarity
TLDR
This paper constructs a number of models, from simple to complex, within a single framework and reports the results, showing that a small number of LSTM stacks with an LSTM stack comparator produces the best results.
...
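As noted in the Coupled-LSTMs entry above, here is a small illustration of dynamic pooling over a word-by-word similarity matrix. Grid-based max pooling to a fixed output size is one common realisation of selecting the most informative matching features from variable-length pairs; the function name and grid size are illustrative assumptions, not necessarily that paper's exact operator.

import torch
import torch.nn.functional as F

def dynamic_pool(sim, grid=4):
    # sim: (len_a, len_b) similarity of every word pair; lengths may vary.
    # Max-pool each cell of a fixed grid x grid partition, so the output
    # size is constant regardless of the two sentence lengths.
    pooled = F.adaptive_max_pool2d(sim[None, None], (grid, grid))
    return pooled.view(-1)  # grid*grid strongest local interactions

# usage: cosine similarities between two encoded sentences (lengths 9 and 13)
a, b = torch.randn(9, 64), torch.randn(13, 64)
sim = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T
features = dynamic_pool(sim)  # fixed 16-d feature vector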

References

Showing 1-10 of 36 references
Learning Natural Language Inference with LSTM
TLDR
A special long short-term memory (LSTM) architecture for NLI is proposed that remembers important mismatches critical for predicting the contradiction or neutral relationship labels; it achieves an accuracy of 86.1%, outperforming the state of the art.
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
TLDR
This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart.
Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
TLDR
This work proposes a model for comparing sentences from a multiplicity of perspectives: it first models each sentence using a convolutional neural network that extracts features at multiple levels of granularity, then applies multiple types of pooling.
A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations
TLDR
This work presents a new deep architecture that matches two sentences with multiple positional sentence representations generated by a bidirectional long short-term memory (Bi-LSTM).
Convolutional Neural Network for Paraphrase Identification
TLDR
A new deep learning architecture, Bi-CNN-MI, is presented for paraphrase identification (PI), based on the insight that PI requires comparing two sentences at multiple levels of granularity; it uses convolutional neural networks to model interaction features at each level.
Recurrent Memory Network for Language Modeling
TLDR
The Recurrent Memory Network (RMN) is proposed, a novel RNN architecture that not only amplifies the power of RNNs but also facilitates understanding of their internal functioning and allows underlying patterns in the data to be discovered.
Convolutional Neural Tensor Network Architecture for Community-Based Question Answering
TLDR
This paper proposes a convolutional neural tensor network architecture that encodes the sentences in semantic space and models their interactions with a tensor layer, outperforming other methods on two matching tasks (a sketch of such a tensor layer follows the reference list).
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improves the LSTM's performance markedly, because doing so introduces many short-term dependencies between the source and the target sentence, making the optimization problem easier.
Convolutional Neural Network Architectures for Matching Natural Language Sentences
TLDR
Convolutional neural network models for matching two sentences are proposed by adapting the convolutional strategy from vision and speech; they nicely represent the hierarchical structure of sentences through layer-by-layer composition and pooling.
Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents
TLDR
A multi-timescale long short-term memory (MT-LSTM) neural network is proposed that can model very long documents as well as short sentences.
...
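To make the tensor-layer interaction mentioned in the convolutional neural tensor network reference concrete, here is a hedged sketch of a neural-tensor matching layer of the form score = u^T tanh(v1^T M[1:k] v2 + V [v1; v2] + b); the class name, dimensions, and slice count k are illustrative assumptions rather than that paper's exact configuration.

import torch
import torch.nn as nn

class TensorMatchLayer(nn.Module):
    # score = u^T tanh(v1^T M[1:k] v2 + V [v1; v2] + b)
    def __init__(self, dim, k=8):
        super().__init__()
        self.M = nn.Parameter(0.01 * torch.randn(k, dim, dim))  # k bilinear slices
        self.V = nn.Linear(2 * dim, k)                          # linear term + bias b
        self.u = nn.Linear(k, 1, bias=False)                    # scoring vector u

    def forward(self, v1, v2):
        # v1, v2: (batch, dim) sentence vectors from any encoder
        bilinear = torch.einsum('bd,kde,be->bk', v1, self.M, v2)
        hidden = torch.tanh(bilinear + self.V(torch.cat([v1, v2], dim=-1)))
        return self.u(hidden).squeeze(-1)  # one matching score per pair

# usage
layer = TensorMatchLayer(dim=64)
print(layer(torch.randn(3, 64), torch.randn(3, 64)).shape)  # torch.Size([3])

The k bilinear slices let the layer capture multiplicative interactions between the two sentence vectors that a purely linear layer would miss.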