# Investigating the Role of Negatives in Contrastive Representation Learning

```bibtex
@article{Ash2021InvestigatingTR,
  title   = {Investigating the Role of Negatives in Contrastive Representation Learning},
  author  = {J. T. Ash and Surbhi Goel and A. Krishnamurthy and Dipendra Kumar Misra},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2106.09943}
}
```

Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning: given a notion of semantic similarity, the learner tries to distinguish a similar (positive) example from a collection of random (negative) examples. The success of modern contrastive learning pipelines relies on many parameters such as the choice of data augmentation, the number of negative examples, and the batch…
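The objective described in the abstract, scoring a positive example against a collection of random negatives, is commonly implemented as an InfoNCE-style cross-entropy loss. The sketch below illustrates this for a single anchor; the function name, dot-product similarity, and temperature value are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor.

    The anchor should score higher (by dot product) against its
    positive than against each random negative; the loss is the
    cross-entropy of picking the positive out of all candidates.
    """
    pos_sim = anchor @ positive / temperature       # scalar score for the positive
    neg_sims = negatives @ anchor / temperature     # one score per negative
    logits = np.concatenate([[pos_sim], neg_sims])  # positive sits at index 0
    # Numerically stable log-softmax, evaluated at the positive's index.
    m = logits.max()
    log_sum_exp = m + np.log(np.exp(logits - m).sum())
    return -(logits[0] - log_sum_exp)
```

Adding negatives enlarges the softmax denominator, which is the dependence on the number of negative examples that the paper studies.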


#### References

*Showing 1–10 of 33 references.*

A Theoretical Analysis of Contrastive Unsupervised Representation Learning

- Computer Science, Mathematics
- ICML
- 2019

This framework allows us to show provable guarantees on the performance of the learned representations on the average classification task that is comprised of a subset of the same set of latent classes and shows that learned representations can reduce (labeled) sample complexity on downstream tasks.

Contrastive Representation Learning: A Framework and Review

- Computer Science, Mathematics
- IEEE Access
- 2020

A general Contrastive Representation Learning framework is proposed that simplifies and unifies many different contrastive learning methods and a taxonomy for each of the components is provided in order to summarise and distinguish it from other forms of machine learning.

LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding

- Computer Science
- AAAI
- 2021

This work proposes a knowledge distillation method LRC-BERT based on contrastive learning to fit the output of the intermediate layer from the angular distance aspect, which is not considered by the existing distillation methods.

On Mutual Information Maximization for Representation Learning

- Computer Science, Mathematics
- ICLR
- 2020

This paper argues, and provides empirical evidence, that the success of these methods cannot be attributed to the properties of MI alone, and that they strongly depend on the inductive bias in both the choice of feature extractor architectures and the parametrization of the employed MI estimators.

Contrastive Estimation: Training Log-Linear Models on Unlabeled Data

- Computer Science
- ACL
- 2005

A novel approach, contrastive estimation, is described, which outperforms EM, is more robust to degradations of the dictionary, and can largely recover by modeling additional features.

Contrastive Learning of Structured World Models

- Computer Science, Mathematics
- ICLR
- 2020

These experiments demonstrate that C-SWMs can overcome limitations of models based on pixel reconstruction and outperform typical representatives of this model class in highly structured environments, while learning interpretable object-based representations.

Contrastive estimation reveals topic posterior information to linear models

- Computer Science, Mathematics
- ArXiv
- 2020

It is proved that contrastive learning is capable of recovering a representation of documents that reveals their underlying topic posterior information to linear models, and it is shown empirically that linear classifiers with these representations perform well in document classification tasks with very few training examples.

Representation Learning with Contrastive Predictive Coding

- Computer Science, Mathematics
- ArXiv
- 2018

This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

Predicting What You Already Know Helps: Provable Self-Supervised Learning

- Computer Science, Mathematics
- ArXiv
- 2020

This paper quantifies how approximate independence between the components of the pretext task (conditional on the label and latent variables) allows us to learn representations that can solve the downstream task with drastically reduced sample complexity by just training a linear layer on top of the learned representation.

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

- Computer Science
- ICLR
- 2020

The contextual representations learned by the proposed replaced token detection pre-training task substantially outperform the ones learned by methods such as BERT and XLNet given the same model size, data, and compute.