Siamese Recurrent Architectures for Learning Sentence Similarity
@inproceedings{Mueller2016SiameseRA, title={Siamese Recurrent Architectures for Learning Sentence Similarity}, author={Jonas Mueller and Aditya Thyagarajan}, booktitle={AAAI}, year={2016} }
We present a siamese adaptation of the Long Short-Term Memory (LSTM) network for labeled data comprised of pairs of variable-length sequences. […] For these applications, we provide word-embedding vectors supplemented with synonymic information to the LSTMs, which use a fixed-size vector to encode the underlying meaning expressed in a sentence (irrespective of the particular wording/syntax).
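The key idea in the abstract is a pair of LSTMs with shared weights, each reading one sentence's word embeddings and emitting a fixed-size encoding, with similarity computed on the two encodings. Below is a minimal PyTorch sketch of that setup; the layer sizes and the exponentiated negative Manhattan distance used as the similarity score are illustrative assumptions, not details stated in the abstract.

```python
# Minimal sketch of a siamese LSTM sentence-similarity model (PyTorch).
# Dimensions and the exp(-L1) similarity are illustrative assumptions.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=50):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)  # would be initialized from pre-trained word vectors
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)  # shared by both branches

    def encode(self, token_ids):
        # token_ids: (batch, seq_len) -> fixed-size sentence vector (batch, hidden_dim)
        _, (h_n, _) = self.lstm(self.embedding(token_ids))
        return h_n[-1]

    def forward(self, sent_a, sent_b):
        h_a, h_b = self.encode(sent_a), self.encode(sent_b)
        # similarity in (0, 1]: exponentiated negative Manhattan distance between encodings
        return torch.exp(-torch.sum(torch.abs(h_a - h_b), dim=1))

model = SiameseLSTM(vocab_size=20000)
a = torch.randint(0, 20000, (4, 12))   # batch of 4 sentences, 12 tokens each
b = torch.randint(0, 20000, (4, 9))    # paired sentences of a different length
print(model(a, b).shape)               # torch.Size([4])
```

Training would fit these scores to the pair labels (e.g. with a regression loss), matching the labeled-pairs setting the abstract describes.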
698 Citations
Few Shot Transfer Learning Between Word Relatedness and Similarity Tasks Using A Gated Recurrent Siamese Network
- Computer Science, AAAI
- 2018
This paper presents regression models that incorporate both lexical features and word embeddings to produce consistent and competitive results compared to the state of the art, presenting the best-performing model across seven of the eight datasets.
Learning Text Similarity with Siamese Recurrent Networks
- Computer Science, Rep4NLP@ACL
- 2016
A deep architecture for learning a similarity metric on variable-length character sequences that combines a stack of character-level bidirectional LSTMs with a Siamese architecture is presented.
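Since this entry describes a stack of character-level bidirectional LSTMs shared across the two branches of a siamese network, a small sketch of one such encoder branch may help; the character vocabulary size, depth, widths, and mean-pooling are assumptions.

```python
# Sketch of one branch of a character-level siamese encoder: a stacked
# bidirectional LSTM over character embeddings (all sizes are illustrative).
import torch
import torch.nn as nn

class CharBiLSTMEncoder(nn.Module):
    def __init__(self, n_chars=128, char_dim=16, hidden_dim=64, layers=2):
        super().__init__()
        self.char_embedding = nn.Embedding(n_chars, char_dim)
        self.bilstm = nn.LSTM(char_dim, hidden_dim, num_layers=layers,
                              bidirectional=True, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len); mean-pool the BiLSTM states into one vector
        out, _ = self.bilstm(self.char_embedding(char_ids))
        return out.mean(dim=1)          # (batch, 2 * hidden_dim)

# The same encoder instance is applied to both strings of a pair, and the two
# resulting vectors are compared (e.g. with cosine similarity) during training.
enc = CharBiLSTMEncoder()
x = torch.randint(0, 128, (2, 40))      # two strings, 40 characters each
print(enc(x).shape)                     # torch.Size([2, 128])
```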
Siamese LSTM with Convolutional Similarity for Similar Question Retrieval
- Computer Science, 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)
- 2018
Experiments on a large-scale real-world Yahoo Answers dataset show that 1D-SLcQA outperforms the state-of-the-art Siamese cQA approach (SCQA).
Dependency-based Siamese long short-term memory network for learning sentence representations
- Computer Science, PLoS ONE
- 2018
The D-LSTM is proposed, which uses a pre-trained dependency parser to obtain the primary sentence information and generate supporting components, and it also uses a standard LSTM model to generate the basic sentence components.
Siamese Multiplicative LSTM for Semantic Text Similarity
- Computer Science, ACAI
- 2020
The Multiplicative LSTM structure with a Siamese architecture, which learns to project the word embeddings of each sentence into a fixed-dimensional embedding space representing that sentence, can be used to evaluate the STS task.
Sentence Similarity Prediction based on Siamese CNN-Bidirectional LSTM with Self-attention
- Computer Science, Journal of KIISE
- 2019
A Siamese CNN-Bidirectional LSTM model that utilizes phonemes instead of words or morphemes and combines long short-term memory (LSTM) with 1D convolutional neural networks of various window lengths that bind phonemes is proposed.
Long short-term memory network for learning sentences similarity using deep contextual embeddings
- Computer Science, International Journal of Information Technology
- 2021
A deep contextual long semantic textual similarity network is proposed, and detailed experimentation shows that the proposed deep contextual model performs better than the human annotation.
Question-Question Similarity in Online Forums
- Computer Science, FIRE
- 2018
In this paper, we applied a deep learning framework to tackle the task of finding duplicate questions. We implemented several models following the siamese architecture using popular recurrent networks…
Exploiting Siamese Neural Networks on Short Text Similarity Tasks for Multiple Domains and Languages
- Computer Science, PROPOR
- 2020
This work adapted the SNN proposed by Mueller and Thyagarajan (2016), extending the architecture to accept three new lexical features and an embedding layer that infers the values of the pre-trained word embeddings, and achieved better results than the baseline in both languages and domains.
Learning Question Similarity with Recurrent Neural Networks
- Computer Science, 2017 IEEE International Conference on Big Knowledge (ICBK)
- 2017
A novel framework for measuring the semantic similarity between sentences, based on the architecture of a recurrent neural network (RNN) encoder-decoder, which does not require lexical or syntactic resources, is proposed.
References
Showing 1-10 of 36 references
Skip-Thought Vectors
- Computer Science, NIPS
- 2015
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the…
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
- Computer Science, ACL
- 2015
The Tree-LSTM is introduced, a generalization of LSTMs to tree-structured network topologies that outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.
Sequence to Sequence Learning with Neural Networks
- Computer Science, NIPS
- 2014
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
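The point about reversing the source sentence is a pure preprocessing trick: it places the beginning of the source next to the beginning of the target, creating many short-range dependencies for the optimizer to exploit. A tiny illustration (the token lists are made up):

```python
# Illustration of the source-reversal trick: reverse only the source tokens
# and leave the target untouched, so early target words end up close to the
# source words they depend on when the encoder reads the sequence.
source = ["the", "cat", "sat", "on", "the", "mat"]
target = ["le", "chat", "s'est", "assis", "sur", "le", "tapis"]

reversed_source = list(reversed(source))
print(reversed_source)  # ['mat', 'the', 'on', 'sat', 'the', 'cat']
# The encoder consumes reversed_source; the decoder is still trained to emit target.
```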
Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
- Computer Science, EMNLP
- 2015
This work proposes a model for comparing sentences that uses a multiplicity of perspectives, first modeling each sentence using a convolutional neural network that extracts features at multiple levels of granularity and uses multiple types of pooling.
Distributed Representations of Sentences and Documents
- Computer Science, ICML
- 2014
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
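For readers who want to try Paragraph Vector directly, gensim ships a Doc2Vec implementation; the snippet below is a minimal usage sketch (the toy documents and hyperparameters are assumptions, not values from the paper).

```python
# Minimal gensim Doc2Vec (Paragraph Vector) usage sketch on toy data.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument(words=["a", "siamese", "lstm", "compares", "sentences"], tags=[0]),
    TaggedDocument(words=["paragraph", "vectors", "encode", "documents"], tags=[1]),
]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

# Infer a fixed-length vector for an unseen piece of text.
vec = model.infer_vector(["sentence", "similarity"])
print(vec.shape)  # (50,)
```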
Distributed Representations of Words and Phrases and their Compositionality
- Computer Science, NIPS
- 2013
This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
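Since this summary highlights negative sampling as the simpler alternative to the hierarchical softmax, the usual form of the skip-gram negative-sampling objective is a useful reminder. For a center word w_I and an observed context word w_O, with k noise words drawn from a distribution P_n(w), sigma the logistic sigmoid, and v, v' the input and output word vectors, the term maximized per training pair is (a sketch of the standard formulation, not quoted from this page):

```latex
\log \sigma\!\left( {v'_{w_O}}^{\top} v_{w_I} \right)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
      \left[ \log \sigma\!\left( -{v'_{w_i}}^{\top} v_{w_I} \right) \right]
```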
A large annotated corpus for learning natural language inference
- Computer Science, EMNLP
- 2015
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
LSTM: A Search Space Odyssey
- Computer Science, IEEE Transactions on Neural Networks and Learning Systems
- 2017
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling, and observes that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
Word Embedding Revisited: A New Representation Learning and Explicit Matrix Factorization Perspective
- Computer Science, IJCAI
- 2015
It is pointed out that SGNS is essentially a representation learning method, which learns to represent the co-occurrence vector for a word, and that extended supervised word embedding can be established based on the proposed representation learning view.
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
- Computer Science, EMNLP
- 2014
Qualitatively, the proposed RNN Encoder‐Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.