Towards Generalizable Sentence Embeddings

Eleni Triantafillou, Jamie Ryan Kiros, Raquel Urtasun, Richard S. Zemel
In this work, we evaluate different sentence encoders with emphasis on examining their embedding spaces. Specifically, we hypothesize that a “high-quality” embedding aids in generalization, promoting transfer learning as well as zero-shot and one-shot learning. To investigate this, we modify Skipthought vectors to learn a more generalizable space by exploiting a small amount of supervision. The aim is to introduce an additional notion of similarity in the embeddings, rendering the vectors… 

Trimming and Improving Skip-thought Vectors

It is found that a good word embedding initialization is also essential for learning better sentence representations, and the proposed model is a faster, lighter-weight and equally powerful alternative to the original skip-thought model.

Enforcing Compositionality in Sentence Embedding Models

Preliminary results are presented on a data augmentation method that helps sentence embedding models learn the recursive compositional property of language; the method adds a term, called the compositionality loss, to the training objective.

Semantically Aligned Sentence-Level Embeddings for Agent Autonomy and Natural Language Understanding

This work presents a novel embedding model designed and trained specifically for the purpose of “reasoning in the linguistic domain”, which performs competitively on the SemEval 2013 benchmark and outperforms state-of-the-art models on two key semantic discernment tasks introduced in Chapter 8.

Assessing Composition in Sentence Vector Representations

This work introduces a specialized sentence generation system that produces large, annotated sentence sets meeting specified syntactic, semantic, and lexical constraints, and finds that the method extracts useful information about the differing capacities of these models.

EcForest: Extractive document summarization through enhanced sentence embedding and cascade forest

The evaluation of variant models proposed in this work validates the enhanced sentence embedding and shows that the proposed summarization model performs better than, or is highly competitive with, the state-of-the-art.

Rethinking Skip-thought: A Neighborhood based Approach

It is found that incorporating an autoencoder path in the model does not help it perform better, and in fact hurts the performance of the skip-thought model.

Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

Two embedding-based weighting schemes, modified versions of the augment weight and the entropy frequency, are proposed and combined to calculate the values of the Latent Semantic Analysis input matrix, combining the strengths of traditional weighting schemes and word embeddings.
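The pipeline described above can be sketched in a few lines: build a term-by-sentence matrix, re-weight its cells using word embeddings, then apply the LSA step (an SVD) and score sentences by their loading on the top singular vector. The weighting used here (cosine of each word vector against its sentence's mean vector, with a random embedding standing in for pretrained vectors) is a toy illustration, not the paper's exact augment-weight or entropy-frequency formulas.

```python
import numpy as np

# Toy corpus: each "sentence" is a list of tokens.
sentences = [
    ["cats", "chase", "mice"],
    ["dogs", "chase", "cats"],
    ["mice", "eat", "cheese"],
]
vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# Term-by-sentence count matrix A (rows: words, columns: sentences).
A = np.zeros((len(vocab), len(sentences)))
for j, sent in enumerate(sentences):
    for w in sent:
        A[idx[w], j] += 1.0

# Stand-in for pretrained word embeddings (random, purely illustrative);
# each cell is re-weighted by the cosine similarity between the word
# vector and its sentence's mean vector.
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 8))
E /= np.linalg.norm(E, axis=1, keepdims=True)
for j, sent in enumerate(sentences):
    mean = E[[idx[w] for w in sent]].mean(axis=0)
    mean /= np.linalg.norm(mean)
    for w in set(sent):
        A[idx[w], j] *= max(float(E[idx[w]] @ mean), 0.0)

# LSA step: SVD of the weighted matrix; sentences are scored by the
# magnitude of their loading on the top right singular vector.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
scores = np.abs(Vt[0])
```

The top-scoring columns correspond to the sentences an extractive summarizer would select first.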

ParaPhraser: Russian paraphrase corpus and shared task

Results of the task reflect the following tendencies: the best scores are obtained by traditional classifiers combined with fine-grained linguistic features; however, complex neural networks, shallow methods, and purely technical methods also demonstrate competitive results.

An end-to-end Neural Network Framework for Text Clustering

A pure neural framework for text clustering that jointly learns the text representation and the clustering model in an end-to-end manner, outperforming previous clustering methods by a large margin.

An Integrated Graph Model for Document Summarization

An integrated graph model (iGraph) for extractive text summarization is proposed, using an enhanced embedding model to detect the inherent semantic properties at the word level, bigram level and trigram level.
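The graph idea behind such summarizers can be sketched with a TextRank-style ranker: build a cosine-similarity graph over sentence vectors and run PageRank-style iterations to score sentences. This is a generic sketch of graph-based extractive scoring, not the paper's exact iGraph model with its bigram- and trigram-level embeddings.

```python
import numpy as np

def rank_sentences(vectors, d=0.85, iters=50):
    """PageRank-style scores over a cosine-similarity sentence graph
    (a TextRank-like sketch, not the paper's exact iGraph model)."""
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    sim = np.maximum(V @ V.T, 0.0)   # keep non-negative edge weights
    np.fill_diagonal(sim, 0.0)       # no self-loops
    W = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transitions
    n = len(V)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1.0 - d) / n + d * (W.T @ r)
    return r
```

Sentences with the highest scores are extracted as the summary; the scores form a probability distribution, so they sum to one.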

Towards Universal Paraphrastic Sentence Embeddings

This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
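The simplest architecture in that comparison, composing a sentence embedding by averaging its word vectors, can be sketched directly (the word vectors below are toy placeholders for the paraphrase-trained embeddings):

```python
import numpy as np

def avg_embedding(tokens, word_vecs):
    """Compose a sentence vector by averaging word vectors -- the
    simplest of the compositional architectures compared."""
    return np.mean([word_vecs[t] for t in tokens], axis=0)

def cosine(a, b):
    """Similarity score between two composed sentence vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Despite its simplicity, this averaging model is the notable counterpoint to the LSTM result quoted above: in the paper it transfers better out of domain than the more complex architectures.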

Skip-Thought Vectors

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage.
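The training signal comes from sentence order alone: each sentence is paired with its neighbours, the middle sentence is encoded, and two decoders are trained to generate the previous and next sentences. A minimal sketch of the data preparation and encoding step, with a mean of word vectors standing in for the paper's GRU encoder:

```python
import numpy as np

def make_triples(sentences):
    """Form (previous, current, next) training triples from ordered
    text; skip-thought encodes the middle sentence and trains two
    decoders to generate its neighbours."""
    return [(sentences[i - 1], sentences[i], sentences[i + 1])
            for i in range(1, len(sentences) - 1)]

def encode(tokens, word_vecs):
    """Illustrative encoder: a mean of word vectors stands in for the
    GRU encoder used in the paper."""
    return np.mean([word_vecs[t] for t in tokens], axis=0)
```

The key point is that no labels are needed: any ordered corpus (the paper uses books) yields the triples.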

Improving Distributional Similarity with Lessons Learned from Word Embeddings

It is revealed that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves, and these modifications can be transferred to traditional distributional models, yielding similar gains.

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

A Sentiment Treebank is introduced that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality; the Recursive Neural Tensor Network is introduced to address them.

A Hierarchical Neural Autoencoder for Paragraphs and Documents

This paper introduces an LSTM model that hierarchically builds an embedding for a paragraph from embeddings for sentences and words, then decodes this embedding to reconstruct the original paragraph, and evaluates the reconstructed paragraph using standard metrics to show that neural models are able to encode texts in a way that preserves syntactic, semantic, and discourse coherence.

An Autoencoder Approach to Learning Bilingual Word Representations

This work explores the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments, and achieves state-of-the-art performance.

A large annotated corpus for learning natural language inference

The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.

Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection

This work introduces a method for paraphrase detection based on recursive autoencoders (RAE), including unsupervised RAEs trained with a novel unfolding objective, and learns feature vectors for phrases in syntactic trees to measure word- and phrase-wise similarity between two sentences.
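The dynamic pooling step is what lets variable-length sentence pairs feed a fixed-size classifier: the word-by-word distance matrix between two sentences is pooled down to a fixed grid, with min-pooling keeping the closest word match in each region. A sketch of that layer (assuming both dimensions are at least `n_out`; the paper up-samples smaller matrices):

```python
import numpy as np

def dynamic_min_pool(dist, n_out=4):
    """Min-pool a variable-size word-distance matrix down to a fixed
    n_out x n_out grid; min keeps the closest word match per region."""
    rows, cols = dist.shape
    # Split each axis into n_out roughly equal bins.
    r_edges = np.linspace(0, rows, n_out + 1).astype(int)
    c_edges = np.linspace(0, cols, n_out + 1).astype(int)
    pooled = np.empty((n_out, n_out))
    for i in range(n_out):
        for j in range(n_out):
            pooled[i, j] = dist[r_edges[i]:r_edges[i + 1],
                                c_edges[j]:c_edges[j + 1]].min()
    return pooled
```

The fixed-size pooled matrix is then flattened and passed to a classifier that decides whether the pair is a paraphrase.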

Learning to Understand Phrases by Embedding the Dictionary

This work proposes using the definitions found in everyday dictionaries as a means of bridging the gap between lexical and phrasal semantics, and presents two applications of these architectures: reverse dictionaries that return the name of a concept given a definition or description and general-knowledge crossword question answerers.

Learning Distributed Representations of Sentences from Unlabelled Data

A systematic comparison of models that learn distributed phrase or sentence representations from unlabelled data finds that the optimal approach depends critically on the intended application.