• Corpus ID: 239016966

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian
Self-supervised visual representation learning aims to learn useful representations without relying on human annotations. The joint embedding approach is based on maximizing the agreement between embedding vectors from different views of the same image. Various methods have been proposed to solve the collapsing problem, where all embedding vectors collapse to a trivial constant solution. Among these methods, contrastive learning prevents collapse via negative sample pairs. It has been shown that non…
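The negative-pair mechanism mentioned above is usually realized with an InfoNCE-style loss: each embedding is pulled toward its positive (another view of the same image) and pushed away from the other samples in the batch. The following is a minimal NumPy sketch under that assumption, not the paper's reference implementation; the temperature value is illustrative.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE over a batch: row i of z1 is pulled toward row i of z2
    (the positive) and pushed away from every other row (the negatives).
    Minimal NumPy sketch; temperature is an illustrative default."""
    # L2-normalize so similarities are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature            # pairwise similarity matrix
    # cross-entropy with the diagonal (matching pairs) as the target class
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(z1))
    return -log_probs[idx, idx].mean()
```

Aligned views (small perturbations of the same embeddings) should score much lower than independently drawn views, which is what makes the loss a collapse-resistant training signal.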

Robust Contrastive Learning against Noisy Views

This work proposes a new contrastive loss function that is robust against noisy views and provides rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure.

Contrasting the landscape of contrastive and non-contrastive learning

It is shown through theoretical results and controlled experiments that even on simple data models, non-contrastive losses have a preponderance of non-collapsed bad minima, and that the training process does not avoid these minima.

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

This work introduces the novel "Contrastive Leave One Out Boost" (CLOOB), which uses modern Hopfield networks for covariance enrichment together with the InfoLOOB objective to mitigate the saturation effect of the InfoNCE objective.

Understanding Contrastive Learning Requires Incorporating Inductive Biases

It is demonstrated that analyses that ignore the inductive biases of the function class and training algorithm cannot adequately explain the success of contrastive learning, even provably leading to vacuous guarantees in some settings.

Deep Normed Embeddings for Patient Representation

This work proposes a novel contrastive representation learning objective and a training scheme for clinical time series that avoid the need to compute data augmentations to create similar pairs, and shows how the learned embedding can be used for online patient monitoring, to supplement clinicians, and to improve the performance of downstream machine learning tasks.

Survey on Self-Supervised Learning: Auxiliary Pretext Tasks and Contrastive Learning Methods in Imaging

This survey provides a comprehensive literature review of the top-performing SSL methods using auxiliary pretext and contrastive learning techniques, examines how self-supervised methods compare to supervised ones, and then discusses both further considerations and ongoing challenges faced by SSL.

The Power of Contrast for Feature Learning: A Theoretical Analysis

It is provably shown that contrastive learning outperforms autoencoders, a classical unsupervised learning method, for both feature recovery and downstream tasks, and the role of labeled data in supervised contrastive learning is illustrated.

Understanding Failure Modes of Self-Supervised Learning

The representation space of self-supervised models is studied and a sample-wise Self-Supervised Representation Quality Score (or, Q-Score) is proposed that is able to predict if a given sample is likely to be misclassified in the downstream task, achieving an AUPRC of up to 0.90.

Deep Contrastive Learning is Provably (almost) Principal Component Analysis

We show that Contrastive Learning (CL) under a family of loss functions (including InfoNCE) has a game-theoretical formulation, in which the max player finds representations to maximize contrastiveness…

Momentum Contrastive Voxel-wise Representation Learning for Semi-supervised Volumetric Medical Image Segmentation

A novel Contrastive Voxel-wise Representation Distillation (CVRD) method with geometric constraints is proposed to learn global-local visual representations for volumetric medical image segmentation with limited annotations, and results on the Atrial Segmentation Challenge dataset demonstrate the superiority of the proposed scheme.

Understanding Self-Supervised Learning Dynamics without Contrastive Pairs

This study yields conceptual insights into how non-contrastive SSL methods learn, how they avoid representational collapse, and how multiple factors, like predictor networks, stop-gradients, exponential moving averages, and weight decay all come into play.

Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss

This work proposes a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective on neural net representations, leading to features with provable accuracy guarantees under linear probe evaluation.
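The spectral contrastive loss described above has a closed form that is easy to write down: attract positive pairs and penalize the squared similarity between independently drawn pairs. Below is a minimal NumPy sketch under the standard batch-estimate assumption (using the off-diagonal batch pairs as the independent pairs); it is not the authors' reference code.

```python
import numpy as np

def spectral_contrastive_loss(z1, z2):
    """Spectral contrastive loss: -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2].
    Positive pairs sit on matching rows of z1/z2; off-diagonal batch
    pairs serve as the independent-pair estimate."""
    n = len(z1)
    pos = np.sum(z1 * z2, axis=1).mean()     # E[f(x)^T f(x+)]
    sim = z1 @ z2.T                           # all cross-view similarities
    mask = ~np.eye(n, dtype=bool)             # exclude the positive pairs
    neg = (sim[mask] ** 2).mean()             # E[(f(x)^T f(x'))^2]
    return -2.0 * pos + neg
```

With orthonormal, perfectly matched embeddings the positive term is maximal and the penalty term vanishes, giving the loss its minimum of -2 in this sketch.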

Exploring Simple Siamese Representation Learning

  • Xinlei Chen, Kaiming He
  • Computer Science
    2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
Surprising empirical results are reported that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders.
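The key ingredient in the Siamese setup above is a symmetrized negative cosine similarity with a stop-gradient on the target branch. Since NumPy has no autograd, the stop-gradient is implicit in the sketch below, which only evaluates the loss value; it is an illustration of the objective, not the paper's implementation.

```python
import numpy as np

def simsiam_loss(p1, z2, p2, z1):
    """Symmetrized negative cosine similarity in the spirit of SimSiam:
    p* are predictor outputs, z* are the (stop-gradient) target
    embeddings of the other view. Returns the scalar loss value."""
    def neg_cos(p, z):
        p = p / np.linalg.norm(p, axis=1, keepdims=True)
        z = z / np.linalg.norm(z, axis=1, keepdims=True)  # treated as constant (stop-grad)
        return -(p * z).sum(axis=1).mean()
    return 0.5 * neg_cos(p1, z2) + 0.5 * neg_cos(p2, z1)
```

Identical predictor and target vectors give the minimum value of -1; anti-aligned ones give +1.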

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

This paper introduces VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings along each dimension individually.
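VICReg's three terms (invariance, variance, covariance) can be written compactly. The sketch below is a NumPy illustration under the assumption of batch statistics; the loss coefficients are illustrative, not the paper's tuned values.

```python
import numpy as np

def vicreg_loss(z1, z2, lam=25.0, mu=25.0, nu=1.0, eps=1e-4):
    """VICReg-style objective: MSE between views (invariance), a hinge
    keeping each dimension's std above 1 (variance), and a penalty on
    off-diagonal covariance (covariance). Coefficients are illustrative."""
    invariance = ((z1 - z2) ** 2).mean()

    def var_term(z):
        std = np.sqrt(z.var(axis=0) + eps)
        return np.maximum(0.0, 1.0 - std).mean()      # per-dimension hinge

    def cov_term(z):
        zc = z - z.mean(axis=0)
        cov = zc.T @ zc / (len(z) - 1)
        off = cov - np.diag(np.diag(cov))
        return (off ** 2).sum() / z.shape[1]

    return (lam * invariance
            + mu * (var_term(z1) + var_term(z2))
            + nu * (cov_term(z1) + cov_term(z2)))
```

A fully collapsed batch (all embeddings identical) is heavily penalized by the variance hinge, which is exactly how the method avoids the collapse problem without negative pairs.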

On Feature Decorrelation in Self-Supervised Learning

This work verifies the existence of complete collapse and of another, usually overlooked, reachable collapse pattern, namely dimensional collapse, which is connected with strong correlations between axes and serves as a strong motivation for feature decorrelation.
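Dimensional collapse of the kind discussed here is commonly diagnosed by inspecting the singular value spectrum of the embedding covariance: collapsed directions show up as (near-)zero trailing singular values. The following is a small NumPy diagnostic in that spirit, not code from any of the cited papers.

```python
import numpy as np

def singular_value_spectrum(z):
    """Singular values (sorted descending) of the covariance matrix of a
    batch of embeddings z. Trailing values near zero indicate that the
    embeddings occupy a lower-dimensional subspace (dimensional collapse)."""
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / len(z)
    return np.linalg.svd(cov, compute_uv=False)
```

For example, 8-dimensional embeddings that actually live in a 2-dimensional subspace yield roughly six vanishing singular values.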

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

This paper proposes an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed, and uses a swapped prediction mechanism in which it predicts the cluster assignment of a view from the representation of another view.
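The swapped prediction mechanism can be sketched with shared prototype vectors: each view's similarity to the prototypes yields a soft cluster assignment, and each view is trained to predict the other view's assignment. The sketch below uses plain softmax assignments and deliberately omits SwAV's Sinkhorn equipartition step; the prototypes, temperature, and epsilon are illustrative assumptions.

```python
import numpy as np

def swapped_prediction_loss(z1, z2, prototypes, temperature=0.1, eps=1e-12):
    """Swapped prediction in the spirit of SwAV: view 1's prototype
    scores must predict view 2's soft cluster assignment, and vice
    versa. Plain softmax assignments; no Sinkhorn equipartition."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    def softmax(x):
        e = np.exp(x - x.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)
    p = normalize(prototypes)
    q1 = softmax(normalize(z1) @ p.T / temperature)   # soft assignment, view 1
    q2 = softmax(normalize(z2) @ p.T / temperature)   # soft assignment, view 2
    # swapped cross-entropy: each view predicts the other's assignment
    ce12 = -(q2 * np.log(q1 + eps)).sum(axis=1).mean()
    ce21 = -(q1 * np.log(q2 + eps)).sum(axis=1).mean()
    return 0.5 * (ce12 + ce21)
```

Two views of the same embeddings agree on their cluster assignments and therefore score much lower than two unrelated batches.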

A Simple Framework for Contrastive Learning of Visual Representations

It is shown that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.

Representation Learning with Contrastive Predictive Coding

This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

This work proposes an objective function that naturally avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible.
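The cross-correlation objective described above is compact enough to sketch directly: standardize each view's embeddings, form their cross-correlation matrix, and push it toward the identity. The NumPy sketch below is illustrative; the lambda weight is an assumption, not the paper's tuned value.

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins-style redundancy reduction: drive the cross-correlation
    matrix between the two views' standardized embeddings toward the
    identity. Diagonal term enforces invariance; off-diagonal term
    decorrelates features."""
    n = len(z1)
    z1 = (z1 - z1.mean(axis=0)) / z1.std(axis=0)
    z2 = (z2 - z2.mean(axis=0)) / z2.std(axis=0)
    c = z1.T @ z2 / n                                    # cross-correlation
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # invariance term
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # redundancy term
    return on_diag + lam * off_diag
```

Identical views give a cross-correlation with ones on the diagonal, so the loss is near zero; unrelated views leave the diagonal near zero and are penalized.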

With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations

This work finds that using the nearest neighbor as the positive in contrastive losses significantly improves performance on ImageNet classification using ResNet-50 under the linear evaluation protocol, and demonstrates empirically that the method is less reliant on complex data augmentations.
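The nearest-neighbor step above amounts to replacing each query embedding with its most similar entry from a support set before computing the contrastive loss. Below is a minimal NumPy sketch of that lookup step only (the support set, in NNCLR, would be a queue of past embeddings; here it is just an array).

```python
import numpy as np

def nearest_neighbor_positive(query, support):
    """NNCLR-style positive lookup: for each query embedding, return its
    nearest neighbor (by cosine similarity) from the support set, to be
    used as the positive in a contrastive loss."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    idx = (q @ s.T).argmax(axis=1)    # most similar support row per query
    return support[idx]
```

A query close to one of the support rows is mapped onto that row, which injects more semantic variety into the positives than augmentation alone.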