Comparison and Combination of Sentence Embeddings Derived from Different Supervision Signals

Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda
There have been many successful applications of sentence embedding methods. However, it has not been well understood what properties are captured in the resulting sentence embeddings depending on the supervision signals. In this paper, we focus on two types of sentence embedding methods with similar architectures and tasks: one fine-tunes pre-trained language models on the natural language inference task, and the other fine-tunes pre-trained language models on the word prediction task from its…

KP-USE: An Unsupervised Approach for Key-Phrases Extraction from Documents
KP-USE uses the Universal Sentence Encoder (USE) as its embedding method for text representation, and it outperforms recent AKE methods that are based on embedding techniques.


On the Sentence Embeddings from Pre-trained Language Models
This paper proposes to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective and achieves significant performance gains over the state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks.
TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
This work presents a new state-of-the-art unsupervised method based on pre-trained Transformers and Sequential Denoising Auto-Encoder (TSDAE) which outperforms previous approaches by up to 6.4 points and shows that TSDAE is a strong domain adaptation and pre-training method for sentence embeddings, outperforming other approaches like Masked Language Model.
ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
ConSERT, a Contrastive Framework for Self-Supervised SEntence Representation Transfer, adopts contrastive learning to fine-tune BERT in an unsupervised and effective way and achieves new state-of-the-art performance on STS tasks.
DefSent: Sentence Embeddings using Definition Sentences
DefSent is a sentence embedding method that uses definition sentences from a word dictionary. It performs comparably on unsupervised semantic textual similarity (STS) tasks and slightly better on SentEval tasks than conventional methods.
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
Inspired by recent advances in deep metric learning (DML), this work carefully designs a self-supervised objective for learning universal sentence embeddings that does not require labelled training data and closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders.
SimCSE: Simple Contrastive Learning of Sentence Embeddings
SimCSE is a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. It regularizes pre-trained embeddings’ anisotropic space to be more uniform, and it better aligns positive pairs when supervised signals are available.
Universal Sentence Encoder
Transfer learning using sentence embeddings tends to outperform word-level transfer, and it achieves surprisingly good performance with minimal amounts of supervised training data for a transfer task.
SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation
The STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017), providing insight into the limitations of existing models.
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
Skip-Thought Vectors
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the