SimCSE: Simple Contrastive Learning of Sentence Embeddings

@article{Gao2021SimCSESC,
  title={SimCSE: Simple Contrastive Learning of Sentence Embeddings},
  author={Tianyu Gao and Xingcheng Yao and Danqi Chen},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.08821}
}
This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective, with only standard dropout used as noise. This simple method works surprisingly well, performing on par with previous supervised counterparts. We find that dropout acts as minimal data augmentation and removing it leads to a representation…
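A minimal PyTorch sketch of the unsupervised objective described above (not the official implementation): the same batch is encoded twice so the two views differ only through BERT's standard dropout masks, and an InfoNCE-style loss treats the two encodings of each sentence as a positive pair and the rest of the batch as negatives. The checkpoint name, [CLS] pooling, and temperature below are illustrative choices.

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")
    encoder.train()  # keep dropout active: it is the only "augmentation"

    def embed(sentences):
        batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        return encoder(**batch).last_hidden_state[:, 0]  # [CLS] pooling (one of several options)

    def simcse_loss(sentences, temperature=0.05):
        z1 = embed(sentences)  # first pass: one dropout mask
        z2 = embed(sentences)  # second pass: a different dropout mask
        sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
        labels = torch.arange(sim.size(0))  # positives sit on the diagonal
        return F.cross_entropy(sim, labels)

    loss = simcse_loss(["A man is playing guitar.", "The weather is nice today."])
    loss.backward()
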
Improving Contrastive Learning of Sentence Embeddings with Case-Augmented Positives and Retrieved Negatives
TLDR
This work proposes switch-case augmentation to flip the case of the first letter of randomly selected words in a sentence to counteract the intrinsic bias of pre-trained token embeddings to frequency, word cases and subwords.
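A rough sketch of such a transform, assuming simple whitespace tokenization and an illustrative selection probability (neither is taken from the paper):

    import random

    def switch_case(sentence, p=0.15):
        """Flip the case of the first letter of randomly selected words (illustrative sketch)."""
        words = []
        for w in sentence.split():
            if w and w[0].isalpha() and random.random() < p:
                w = w[0].swapcase() + w[1:]
            words.append(w)
        return " ".join(words)

    print(switch_case("The quick brown fox jumps over the lazy dog"))
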
Pairwise Supervised Contrastive Learning of Sentence Representations
TLDR
PairSupCon, an instance discrimination based approach aiming to bridge semantic entailment and contradiction understanding with high-level categorical concept encoding, is proposed and evaluated on various downstream tasks that involve understanding sentence semantics at different granularities.
ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding
TLDR
ESimCSE draws inspiration from the computer vision community and introduces momentum contrast to enlarge the number of negative pairs without additional calculations; experimental results show that it outperforms the state-of-the-art unsupervised SimCSE by an average Spearman correlation of 2.02% on BERT-base.
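A hedged sketch of the momentum-contrast idea mentioned in the summary, assuming PyTorch: a momentum encoder tracks the main encoder via an exponential moving average, and its detached outputs are pushed into a fixed-size FIFO queue that supplies extra negatives without extra gradient computation. Queue size and momentum coefficient are illustrative, not ESimCSE's settings.

    import torch

    class NegativeQueue:
        """FIFO queue of past sentence embeddings reused as extra negatives (sketch)."""
        def __init__(self, dim, size=256):
            self.size = size
            self.queue = torch.zeros(0, dim)

        def push(self, embeddings):
            # Detach so queued negatives contribute no gradients; keep the newest `size` entries.
            self.queue = torch.cat([self.queue, embeddings.detach()], dim=0)[-self.size:]

    @torch.no_grad()
    def momentum_update(encoder, momentum_encoder, m=0.995):
        # Exponential moving average: the momentum encoder slowly follows the main encoder.
        for p, p_m in zip(encoder.parameters(), momentum_encoder.parameters()):
            p_m.data.mul_(m).add_(p.data, alpha=1.0 - m)
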
TransAug: Translate as Augmentation for Sentence Embeddings
TLDR
TransAug (Translate as Augmentation) provides the first exploration of using translated sentence pairs as data augmentation for text and introduces a two-stage paradigm that advances state-of-the-art sentence embeddings.
PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings
TLDR
A novel Peer-Contrastive Learning (PCL) framework with diverse augmentations is proposed; it performs peer-positive contrast as well as peer-network cooperation, which offers an inherent anti-bias ability and an effective way to learn from diverse augmentations.
MCSE: Multimodal Contrastive Learning of Sentence Embeddings
TLDR
This work proposes a sentence embedding learning approach that exploits both visual and textual information via a multimodal contrastive objective and shows that this model excels in aligning semantically similar sentences, providing an explanation for its improved performance.
Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning
TLDR
This work explicitly describes the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem, and presents the optimal transport-based distance measure, named RCMD; it identifies and leverages semantically-aligned token pairs and enhances the quality of sentence similarity estimation and its interpretation.
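In generic optimal-transport notation, scoring a sentence pair this way amounts to the following; the cost function, token weights, and constraints that RCMD actually uses are specified in the paper, so this is only a hedged sketch of the general formulation:

    d(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} T^{*}_{ij} \, c(h_i, h'_j),
    \qquad
    T^{*} = \operatorname*{arg\,min}_{T \ge 0} \sum_{i,j} T_{ij} \, c(h_i, h'_j)
    \quad \text{s.t.} \quad \sum_{j} T_{ij} = a_i, \;\; \sum_{i} T_{ij} = b_j,

where h_i and h'_j are contextualized token embeddings of the two sentences, c(·,·) is a token-level distance (e.g., cosine distance), and a, b are token weight distributions; the optimal plan T* indicates which token pairs are semantically aligned.
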
Pair-Level Supervised Contrastive Learning for Natural Language Inference
TLDR
This paper proposes a Pair-level Supervised Contrastive Learning approach (PairSCL), which adopts a cross attention module to learn the joint representations of the sentence pairs and outperforms the previous state-of-the-art method on seven transfer tasks of text classification.
A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space
TLDR
This paper proposes a new method ArcCSE, with training objectives designed to enhance the pairwise discriminative power and model the entailment relation of triplet sentences, and demonstrates that this approach outperforms the previous state-of-the-art on diverse sentence related tasks, including STS and SentEval.
Self-Guided Contrastive Learning for BERT Sentence Representations
TLDR
This work fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors, and redesigns the contrastive learning objective (NT-Xent) and applies it to sentence representation learning.
...

References

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
TLDR
ConSERT is presented, a Contrastive Framework for Self-Supervised SEntence Representation Transfer that adopts contrastive learning to fine-tune BERT in an unsupervised and effective way and achieves new state-of-the-art performance on STS tasks.
Self-Guided Contrastive Learning for BERT Sentence Representations
TLDR
This work fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors, and redesigns the contrastive learning objective (NT-Xent) and applies it to sentence representation learning.
A Simple but Tough-to-Beat Baseline for Sentence Embeddings
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
TLDR
Inspired by recent advances in deep metric learning (DML), this work carefully designs a self-supervised objective for learning universal sentence embeddings that does not require labelled training data and closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders.
CLEAR: Contrastive Learning for Sentence Representation
TLDR
This paper proposes Contrastive LEArning for sentence Representation (CLEAR), which employs multiple sentence-level augmentation strategies in order to learn a noise-invariant sentence representation and investigates the key reasons that make contrastive learning effective through numerous experiments.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
TLDR
Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity, is presented.
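A quick usage sketch with the sentence-transformers library; the checkpoint name below is just one publicly available SBERT-style model, chosen for illustration:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative pretrained checkpoint
    sentences = ["A man is playing guitar.", "Someone is playing an instrument."]
    embeddings = model.encode(sentences, convert_to_tensor=True)
    # Embeddings are meant to be compared directly with cosine similarity.
    print(util.cos_sim(embeddings[0], embeddings[1]))
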
On the Sentence Embeddings from Pre-trained Language Models
TLDR
This paper proposes to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective and achieves significant performance gains over the state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks.
An Unsupervised Sentence Embedding Method by Mutual Information Maximization
TLDR
Experimental results show that the proposed lightweight extension on top of BERT significantly outperforms other unsupervised sentence embedding baselines on common semantic textual similarity (STS) tasks and downstream supervised tasks, and achieves performance competitive with supervised methods on various tasks.
WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach
TLDR
This work conducts a thorough examination of pretrained-model-based unsupervised sentence embeddings and concludes that an easy whitening-based vector normalization strategy with less than 10 lines of code consistently boosts the performance.
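The whitening strategy referred to above is indeed only a handful of lines; a minimal numpy sketch of the common formulation (center the embeddings, then rotate and rescale by the SVD factors of their covariance), leaving out refinements such as dimensionality reduction:

    import numpy as np

    def whiten(embeddings):
        """Map sentence embeddings to zero mean and (approximately) identity covariance."""
        mu = embeddings.mean(axis=0, keepdims=True)
        cov = np.cov((embeddings - mu).T)
        u, s, _ = np.linalg.svd(cov)
        w = u @ np.diag(1.0 / np.sqrt(s))
        return (embeddings - mu) @ w
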
ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations
TLDR
This work uses ParaNMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs, to train paraphrastic sentence embeddings that outperform all supervised systems on every SemEval semantic textual similarity competition, in addition to showing how it can be used for paraphrase generation.
...