Corpus ID: 216080787

Supervised Contrastive Learning

@article{Khosla2020SupervisedCL,
  title={Supervised Contrastive Learning},
  author={Prannay Khosla and Piotr Teterwak and Chen Wang and Aaron Sarna and Yonglong Tian and Phillip Isola and Aaron Maschinot and Ce Liu and Dilip Krishnan},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.11362}
}
Cross entropy is the most widely used loss function for supervised training of image classification models. In this paper, we propose a novel training methodology that consistently outperforms cross entropy on supervised learning tasks across different architectures and data augmentations. We modify the batch contrastive loss, which has recently been shown to be very effective at learning powerful representations in the self-supervised setting. We are thus able to leverage label information…
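The core idea is to extend a batch contrastive loss so that all samples sharing a label act as positives for one another. Below is a minimal PyTorch sketch of such a supervised contrastive loss, assuming one embedding per sample and a single temperature hyperparameter; the paper's exact formulation (multiple augmented views, its particular normalization choices) may differ in detail.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Batch-contrastive loss treating all same-label samples as positives.

    features: (N, D) embeddings, one view per sample (the paper uses multiple
              augmented views; this sketch keeps a single view for brevity).
    labels:   (N,) integer class labels.
    """
    device = features.device
    z = F.normalize(features, dim=1)                    # unit-norm embeddings
    sim = z @ z.T / temperature                         # pairwise similarities
    # exclude self-similarity from both numerator and denominator
    self_mask = torch.eye(len(z), dtype=torch.bool, device=device)
    sim = sim.masked_fill(self_mask, float('-inf'))
    # positives: same label, excluding self
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-probability over each anchor's positives
    # (anchors with no positive in the batch contribute zero loss)
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_count
    return loss.mean()

feats = torch.randn(8, 128)
labels = torch.randint(0, 3, (8,))
print(supervised_contrastive_loss(feats, labels))
```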

Citations

Class Interference Regularization
TLDR
Class Interference Regularization (CIR) is the first regularization technique to act on the output features of a contrastive loss, and performs on par with the popular label smoothing technique, as demonstrated on CIFAR-10 and CIFAR-100.
Does Data Augmentation Benefit from Split BatchNorms
TLDR
A recently proposed training paradigm is explored using an auxiliary BatchNorm for the potentially out-of-distribution, strongly augmented images; this method significantly improves performance on common image classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet.
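As a rough illustration of the auxiliary-BatchNorm idea, the sketch below routes weakly and strongly augmented batches through separate BatchNorm layers while sharing all other parameters; the module and argument names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class SplitBNBlock(nn.Module):
    """Conv block with a main BN for weakly augmented inputs and an auxiliary
    BN for strongly augmented (potentially out-of-distribution) ones. All
    other parameters are shared between the two paths."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.bn_main = nn.BatchNorm2d(out_ch)  # statistics of weak augmentations
        self.bn_aux = nn.BatchNorm2d(out_ch)   # statistics of strong augmentations
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, strong_aug=False):
        x = self.conv(x)
        x = self.bn_aux(x) if strong_aug else self.bn_main(x)
        return self.act(x)

# usage: weak and strong views of the same batch use different BN statistics
block = SplitBNBlock(3, 16)
out_weak = block(torch.randn(4, 3, 32, 32), strong_aug=False)
out_strong = block(torch.randn(4, 3, 32, 32), strong_aug=True)
```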
Contrastive Learning with Adversarial Examples
TLDR
A new family of adversarial examples for contrastive learning is introduced and used to define a new adversarial training algorithm for SSL, denoted CLAE, which improves the performance of several existing CL baselines on multiple datasets.
G-SimCLR: Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling
TLDR
This work proposes that, with the normalized temperature-scaled cross-entropy loss (as used in SimCLR), it is beneficial not to have images of the same category in the same batch, and uses the latent-space representation of a denoising autoencoder trained on the unlabeled dataset to obtain pseudo labels.
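One simple way to realize the "no two images of the same (pseudo) category in a batch" constraint is a label-aware batch sampler. The helper below is a hypothetical sketch of that batching step, not the authors' code; it only assumes a list of pseudo labels as input.

```python
import random
from collections import defaultdict

def label_disjoint_batches(pseudo_labels, batch_size):
    """Group sample indices into batches that contain at most one example per
    pseudo label, so in-batch negatives are unlikely to share a class."""
    buckets = defaultdict(list)
    for idx, lab in enumerate(pseudo_labels):
        buckets[lab].append(idx)
    for bucket in buckets.values():
        random.shuffle(bucket)
    batches = []
    while buckets:
        current = []
        for lab in list(buckets):          # one round-robin pass over labels
            current.append(buckets[lab].pop())
            if not buckets[lab]:
                del buckets[lab]
            if len(current) == batch_size:
                batches.append(current)
                current = []
        if current:                        # flush so no label repeats in a batch
            batches.append(current)
    return batches

# example: 10 samples, 4 pseudo labels, batches of 4 with distinct labels
print(label_disjoint_batches([0, 1, 2, 3, 0, 1, 2, 3, 0, 1], batch_size=4))
```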
i-Mix: A Strategy for Regularizing Contrastive Representation Learning
TLDR
It is demonstrated that i-Mix consistently improves the quality of self-supervised representations across domains, resulting in significant performance gains on downstream tasks, and its regularization effect is confirmed via extensive ablation studies across model and dataset sizes.
Contrastive Generative Adversarial Networks
TLDR
A novel conditional contrastive loss that maximizes a lower bound on the mutual information between samples from the same class is proposed to improve conditional image synthesis; the approach is robust to the choice of network architecture.
Adversarial Self-Supervised Contrastive Learning
TLDR
This paper proposes a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples, and presents a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
Self-supervised Co-training for Video Representation Learning
TLDR
This paper investigates the benefit of adding semantic-class positives to instance-based InfoNCE (Info Noise-Contrastive Estimation) training, and proposes a novel self-supervised co-training scheme to improve the popular InfoNCE loss.
Hybrid Discriminative-Generative Training via Contrastive Learning
Hao Liu, P. Abbeel. ArXiv, 2020.
TLDR
This paper shows that, through the perspective of hybrid discriminative-generative training of energy-based models, a direct connection can be made between contrastive learning and supervised learning, and that a specific choice of approximation of the energy-based loss outperforms existing practice in terms of classification accuracy.
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding
TLDR
A novel data augmentation framework dubbed CoDA is proposed, which synthesizes diverse and informative augmented examples by organically integrating multiple transformations, and introduces a contrastive regularization objective to capture the global relationship among all the data samples.

References

Showing 1-10 of 76 references
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
TLDR
A theoretically-principled label-distribution-aware margin (LDAM) loss motivated by minimizing a margin-based generalization bound is proposed that replaces the standard cross-entropy objective during training and can be applied with prior strategies for training with class imbalance, such as re-weighting or re-sampling.
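For reference, LDAM assigns each class a margin proportional to n_j^(-1/4) and subtracts it from the true-class logit before a scaled softmax cross-entropy. The PyTorch sketch below follows that recipe; the scale and maximum-margin values are illustrative defaults rather than prescriptions.

```python
import torch
import torch.nn.functional as F

def ldam_loss(logits, targets, class_counts, max_margin=0.5, scale=30.0):
    """Label-distribution-aware margin loss (sketch).

    Rarer classes get larger margins: margin_j proportional to n_j^(-1/4),
    rescaled so the largest margin equals `max_margin`. The true-class logit
    is reduced by its margin before a scaled softmax cross-entropy.
    """
    counts = torch.as_tensor(class_counts, dtype=torch.float, device=logits.device)
    margins = 1.0 / counts.pow(0.25)
    margins = margins * (max_margin / margins.max())
    # subtract the per-class margin from the logit of the true class only
    adjusted = logits.clone()
    adjusted[torch.arange(len(targets)), targets] -= margins[targets]
    return F.cross_entropy(scale * adjusted, targets)

# example with a long-tailed class distribution
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
print(ldam_loss(logits, targets, class_counts=[5000, 500, 50]))
```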
Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels
TLDR
A theoretically grounded set of noise-robust loss functions, which can be seen as a generalization of MAE and CCE, is presented; these losses can be readily applied with any existing DNN architecture and algorithm while yielding good performance in a wide range of noisy-label scenarios.
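The generalized cross-entropy in question is the L_q loss, (1 - p_y^q)/q, which interpolates between standard cross-entropy (q -> 0) and MAE on the predicted probability (q = 1). A minimal sketch, with q = 0.7 chosen only for illustration:

```python
import torch
import torch.nn.functional as F

def generalized_cross_entropy(logits, targets, q=0.7):
    """L_q loss: (1 - p_y^q) / q, a noise-robust blend of CCE and MAE."""
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # prob of true class
    return ((1.0 - p_y.pow(q)) / q).mean()

logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
print(generalized_cross_entropy(logits, targets))
```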
Large-Margin Softmax Loss for Convolutional Neural Networks
TLDR
A generalized large-margin softmax (L-Softmax) loss is proposed that explicitly encourages intra-class compactness and inter-class separability between learned features, and that can both adjust the desired margin and help avoid overfitting.
RandAugment: Practical data augmentation with no separate search
TLDR
RandAugment can be used uniformly across different tasks and datasets and works out of the box, matching or surpassing all previous learned augmentation approaches on CIFAR-10, CIFAR-100, SVHN, and ImageNet.
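RandAugment reduces augmentation search to two global hyperparameters: the number of operations N and a shared magnitude M. The sketch below applies N uniformly sampled PIL operations at a common magnitude; the operation list and magnitude mapping are simplified stand-ins for the full policy.

```python
import random
from PIL import Image, ImageEnhance, ImageOps

def _ops(m):
    """A reduced operation set; the full list also includes shear, translate,
    cutout, etc. The single magnitude m is mapped into op-specific ranges."""
    level = m / 30.0
    return [
        lambda img: ImageOps.autocontrast(img),
        lambda img: ImageOps.equalize(img),
        lambda img: ImageOps.posterize(img, bits=max(1, 8 - int(4 * level))),
        lambda img: ImageOps.solarize(img, threshold=int(256 * (1 - level))),
        lambda img: ImageEnhance.Color(img).enhance(1 + level),
        lambda img: ImageEnhance.Contrast(img).enhance(1 + level),
        lambda img: ImageEnhance.Brightness(img).enhance(1 + level),
        lambda img: ImageEnhance.Sharpness(img).enhance(1 + level),
        lambda img: img.rotate(30 * level),
    ]

def rand_augment(img, n=2, m=9):
    """Apply n operations chosen uniformly at random, all at magnitude m."""
    for op in random.choices(_ops(m), k=n):
        img = op(img)
    return img

augmented = rand_augment(Image.new("RGB", (32, 32), "gray"), n=2, m=9)
```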
Large Margin Deep Networks for Classification
TLDR
This work proposes a novel loss function to impose a margin on any chosen set of layers of a deep network (including input and hidden layers), and demonstrates that the decision boundary obtained by the loss has nice properties compared to standard classification loss functions.
Cross-Entropy Loss and Low-Rank Features Have Responsibility for Adversarial Examples
State-of-the-art neural networks are vulnerable to adversarial examples; they can easily misclassify inputs that are imperceptibly different from their training and test data. In this work, we…
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features
TLDR
Patches are cut and pasted among training images, with the ground-truth labels mixed proportionally to the area of the patches; CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification, as well as on the ImageNet weakly-supervised localization task.
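A minimal sketch of that cut-and-paste mixing, assuming a PyTorch batch of images and integer labels; the box sampling is kept simple and the returned lambda is adjusted to the area actually pasted.

```python
import torch

def cutmix(images, labels, alpha=1.0):
    """Paste a random patch from a shuffled copy of the batch and mix labels
    in proportion to the patch area.

    Returns mixed images plus (labels_a, labels_b, lam) for computing
    lam * loss(pred, labels_a) + (1 - lam) * loss(pred, labels_b).
    """
    n, _, h, w = images.shape
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(n)

    # box whose area is roughly (1 - lam) of the image, centered at random
    cut_ratio = (1.0 - lam) ** 0.5
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # adjust lambda to the patch area actually pasted (box may be clipped)
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, labels, labels[perm], lam

images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
mixed, labels_a, labels_b, lam = cutmix(images, labels)
```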
Unsupervised Feature Learning via Non-parametric Instance Discrimination
TLDR
This work formulates this intuition as a non-parametric classification problem at the instance level, and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.
Improved Deep Metric Learning with Multi-class N-pair Loss Objective
TLDR
This paper proposes a new metric learning objective called the multi-class N-pair loss, which generalizes the triplet loss by allowing joint comparison among more than one negative example, and reduces the computational burden of evaluating deep embedding vectors via an efficient batch construction strategy using only N pairs of examples.
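With one anchor-positive pair per class in a batch, the N-pair loss reduces to a softmax cross-entropy over the anchor-positive similarity matrix, where each anchor's own positive sits on the diagonal. A minimal sketch (the embedding-norm regularizer from the paper is omitted):

```python
import torch
import torch.nn.functional as F

def n_pair_loss(anchors, positives):
    """Multi-class N-pair loss (sketch).

    anchors, positives: (N, D) embeddings where row i of `positives` is the
    positive for row i of `anchors` and comes from a distinct class, so the
    other N-1 positives serve as negatives for anchor i.
    """
    logits = anchors @ positives.T                          # (N, N) similarities
    targets = torch.arange(len(anchors), device=anchors.device)
    return F.cross_entropy(logits, targets)

anchors, positives = torch.randn(16, 64), torch.randn(16, 64)
print(n_pair_loss(anchors, positives))
```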
A Simple Framework for Contrastive Learning of Visual Representations
TLDR
It is shown that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.
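Two of those ingredients, the learnable projection head and the normalized temperature-scaled cross-entropy (NT-Xent) over two augmented views, can be sketched as follows; layer sizes and the temperature are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Learnable nonlinear transformation between representation and loss."""
    def __init__(self, dim_in, dim_out=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, dim_in), nn.ReLU(inplace=True),
            nn.Linear(dim_in, dim_out),
        )

    def forward(self, h):
        return self.net(h)

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent over two augmented views: z1[i] and z2[i] are projections of
    two augmentations of the same image; all other samples are negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2N, D)
    sim = z @ z.T / temperature
    sim.fill_diagonal_(float('-inf'))                       # exclude self-pairs
    n = z1.size(0)
    # the positive of sample i is its other view, at index (i + n) mod 2n
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets.to(z.device))

head = ProjectionHead(dim_in=512)
h1, h2 = torch.randn(8, 512), torch.randn(8, 512)  # encoder outputs of two views
loss = nt_xent(head(h1), head(h2))
```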