Siamese Recurrent Architectures for Learning Sentence Similarity

@inproceedings{Mueller2016SiameseRA,
  title={Siamese Recurrent Architectures for Learning Sentence Similarity},
  author={Jonas Mueller and Aditya Thyagarajan},
  booktitle={AAAI},
  year={2016}
}
We present a siamese adaptation of the Long Short-Term Memory (LSTM) network for labeled data comprised of pairs of variable-length sequences. […] For these applications, we provide word-embedding vectors supplemented with synonymic information to the LSTMs, which use a fixed-size vector to encode the underlying meaning expressed in a sentence (irrespective of the particular wording/syntax).
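The key method can be sketched in a few lines. The snippet below is a minimal PyTorch illustration, not the authors' code: one LSTM with tied weights encodes each sentence of a pair into its final hidden state, and the pair is scored with the exponentiated negative Manhattan distance between those states, the similarity function used in the paper. The vocabulary size, dimensions, and the name SiameseLSTM are illustrative assumptions.

import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=300, hidden_dim=50):
        super().__init__()
        # The paper feeds pre-trained word embeddings augmented with synonym
        # information; random initialization here is only for brevity.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, token_ids):
        # Map a batch of token-id sequences to fixed-size sentence vectors.
        _, (h_n, _) = self.lstm(self.embedding(token_ids))
        return h_n[-1]                              # (batch, hidden_dim)

    def forward(self, sent_a, sent_b):
        h_a, h_b = self.encode(sent_a), self.encode(sent_b)
        # Manhattan-distance similarity in (0, 1].
        return torch.exp(-torch.sum(torch.abs(h_a - h_b), dim=1))

model = SiameseLSTM()
a = torch.randint(1, 20000, (4, 12))                # 4 sentences, 12 token ids each
b = torch.randint(1, 20000, (4, 12))
print(model(a, b))                                  # one similarity score per pair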


Few Shot Transfer Learning Between Word Relatedness and Similarity Tasks Using A Gated Recurrent Siamese Network
TLDR
This paper presents regression models that incorporate both lexical features and word embeddings to produce consistent and competitive results compared to the state of the art, and presents the best performing model across seven of the eight datasets.
Learning Text Similarity with Siamese Recurrent Networks
TLDR
A deep architecture for learning a similarity metric on variable-length character sequences that combines a stack of character-level bidirectional LSTMs with a Siamese architecture is presented.
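The entry above names the ingredients only at a high level; the following is a hedged sketch of how a shared character-level bidirectional LSTM encoder could sit inside such a Siamese setup. The character-vocabulary size, mean-pooling over time, and cosine scoring are assumptions for illustration, not details taken from that paper.

import torch
import torch.nn as nn

class CharBiLSTMEncoder(nn.Module):
    def __init__(self, num_chars=128, embed_dim=64, hidden_dim=128, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(num_chars, embed_dim, padding_idx=0)
        # Stacked bidirectional LSTM over character ids.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                              bidirectional=True, batch_first=True)

    def forward(self, char_ids):
        out, _ = self.bilstm(self.embedding(char_ids))  # (batch, seq, 2*hidden_dim)
        return out.mean(dim=1)                          # pooled fixed-size vector

# Siamese use: the same encoder embeds both strings; cosine similarity between
# the pooled vectors can then be trained with a contrastive loss.
encoder = CharBiLSTMEncoder()
x = torch.randint(1, 128, (2, 30))
y = torch.randint(1, 128, (2, 30))
print(nn.functional.cosine_similarity(encoder(x), encoder(y), dim=1))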
Siamese LSTM with Convolutional Similarity for Similar Question Retrieval
TLDR
Experiments on a large-scale real-world Yahoo Answers dataset show that 1D-SLcQA outperforms the state-of-the-art Siamese cQA approach (SCQA).
Dependency-based Siamese long short-term memory network for learning sentence representations
TLDR
The D-LSTM is proposed, which uses a pre-trained dependency parser to obtain the primary sentence information and generate supporting components, and it also uses a standard LSTM model to generate the basic sentence components.
Siamese Multiplicative LSTM for Semantic Text Similarity
TLDR
A multiplicative LSTM with a Siamese architecture, which learns to project the word embeddings of each sentence into a fixed-dimensional embedding space representing that sentence, can be used to evaluate the STS task.
Sentence Similarity Prediction based on Siamese CNN-Bidirectional LSTM with Self-attention
TLDR
A Siamese CNN-Bidirectional LSTM model is proposed that utilizes phonemes instead of words or morphemes and combines long short-term memory (LSTM) with 1D convolutional neural networks with various window lengths that bind phonemes.
Long short-term memory network for learning sentences similarity using deep contextual embeddings
TLDR
A long short-term memory network with deep contextual embeddings is proposed for semantic textual similarity, and detailed experimentation and results show that the proposed deep contextual model performs better than human annotation.
Question-Question Similarity in Online Forums
In this paper, we applied a deep learning framework to tackle the task of finding duplicate questions. We implemented several models following the Siamese architecture using the popular recurrent network…
Exploiting Siamese Neural Networks on Short Text Similarity Tasks for Multiple Domains and Languages
TLDR
This work adapted the SNN proposed by Mueller and Thyagarajan (2016), extending the architecture to accept three new lexical features and an embedding layer to infer the values of the pre-trained word embeddings, achieving better results than the baseline in both languages and domains.
Learning Question Similarity with Recurrent Neural Networks
TLDR
A novel framework for measuring the semantic similarity between sentences, based on the architecture of a recurrent neural network (RNN) encoder-decoder and requiring no lexical or syntactic resources, is proposed.
...

References

SHOWING 1-10 OF 36 REFERENCES
Skip-Thought Vectors
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage.
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
TLDR
The Tree-LSTM is introduced, a generalization of LSTMs to tree-structured network topologies that outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.
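To make the generalization to tree-structured topologies concrete, here is a sketch of the Child-Sum Tree-LSTM cell in PyTorch: the children's hidden states are summed to drive the input, output, and update gates, while each child gets its own forget gate over its memory cell. Dimensions and the leaf-node usage are illustrative assumptions.

import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.W = nn.Linear(input_dim, 3 * hidden_dim)           # input/output/update gates
        self.U = nn.Linear(hidden_dim, 3 * hidden_dim, bias=False)
        self.W_f = nn.Linear(input_dim, hidden_dim)              # per-child forget gate
        self.U_f = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, x, child_h, child_c):
        # x: (input_dim,); child_h, child_c: (num_children, hidden_dim)
        h_sum = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.W(x) + self.U(h_sum), 3, dim=-1)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.W_f(x) + self.U_f(child_h))       # one forget gate per child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c

cell = ChildSumTreeLSTMCell(input_dim=300, hidden_dim=150)
x = torch.randn(300)
no_children = torch.zeros(0, 150)                                # a leaf node has no children
h, c = cell(x, no_children, no_children)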
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
TLDR
This work proposes a model for comparing sentences that uses a multiplicity of perspectives, first modeling each sentence with a convolutional neural network that extracts features at multiple levels of granularity and uses multiple types of pooling.
Distributed Representations of Sentences and Documents
TLDR
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Distributed Representations of Words and Phrases and their Compositionality
TLDR
This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
A large annotated corpus for learning natural language inference
TLDR
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
LSTM: A Search Space Odyssey
TLDR
This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling, and observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.
Word Embedding Revisited: A New Representation Learning and Explicit Matrix Factorization Perspective
TLDR
It is pointed out that SGNS is essentially a representation learning method, which learns to represent the co-occurrence vector for a word, and that extended supervised word embedding can be established based on the proposed representation learning view.
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
TLDR
Qualitatively, the proposed RNN Encoder–Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.
...