Corpus ID: 231979539

Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction

  title={Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction},
  author={Mehdi Azabou and Mohammad Gheshlaghi Azar and Ran Liu and Chi-Heng Lin and Erik C. Johnson and Kiran Bhaskaran-Nair and Max Dabagia and Keith B. Hengen and William Gray-Roncal and Michal Valko and Eva L. Dyer},
State-of-the-art methods for self-supervised learning (SSL) build representations by maximizing the similarity between different augmented “views” of a sample. Because these approaches try to match views of the same sample, they can be too myopic and fail to produce meaningful results when augmentations are not sufficiently rich. This motivates the use of the dataset itself to find similar, yet distinct, samples to serve as views for one another. In this paper, we introduce Mine Your Own vieW… 
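The across-sample idea in the abstract can be sketched roughly as follows — a hypothetical NumPy illustration (the names `mine_views` and `k` are illustrative, and distances here are computed in a fixed embedding space, whereas the paper mines neighbors in a learned representation space during training):

```python
import numpy as np

def mine_views(embeddings, k=3, seed=0):
    """For each sample, pick one of its k nearest neighbors (excluding
    itself) in embedding space to serve as an across-sample 'view'.
    Minimal sketch of nearest-neighbor view mining, not the exact
    procedure from the paper."""
    # Pairwise squared Euclidean distances.
    sq = np.sum(embeddings ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2 * embeddings @ embeddings.T
    np.fill_diagonal(d, np.inf)            # a sample cannot be its own mined view
    knn = np.argsort(d, axis=1)[:, :k]     # indices of the k nearest neighbors
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    choice = rng.integers(0, k, size=n)    # pick one neighbor at random per sample
    return knn[np.arange(n), choice]       # index of the mined view for each sample

emb = np.random.default_rng(1).normal(size=(8, 4))
views = mine_views(emb, k=3)
```

Each sample and its mined neighbor would then be treated as a positive pair by the predictive objective.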


Self-Supervised Learning of Pose-Informed Latents

This paper argues that extending the augmentation strategy with different frames of a video leads to more powerful representations, and defines a novel methodology for challenging new tasks such as zero-shot pose estimation.

Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity

  • Ran Liu
  • Computer Science, Biology
  • 2021

This work introduces Swap-VAE, an unsupervised approach for learning disentangled representations of neural activity that combines a generative modeling framework with an instance-specific alignment loss, which maximizes the representational similarity between transformed views of the input (brain state).

With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations

This work finds that using the nearest neighbor as the positive in contrastive losses significantly improves ImageNet classification with ResNet-50 under the linear evaluation protocol, and demonstrates empirically that the method is less reliant on complex data augmentations.

Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning

This work generalizes the mean-shift idea by constraining the search space of nearest neighbors (NNs) using another source of knowledge, so that the NNs are far from the query while still being semantically related.

Using self-supervision and augmentations to build insights into neural coding

Recent progress in the application of self-supervised learning to data analysis in neuroscience is highlighted, the implications of these results are discussed, and ways in which SSL might be applied to reveal interesting properties of neural computation are suggested.

On Feature Decorrelation in Self-Supervised Learning

This work verifies the existence of complete collapse as well as another reachable but usually overlooked collapse pattern, namely dimensional collapse, connects it with strong correlations between axes, and takes this as a strong motivation for feature decorrelation.

Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL

This work proposes Cross Trajectory Representation Learning (CTRL), a method that runs within an RL agent and conditions its encoder to recognize behavioral similarity in observations by applying a novel SSL objective to pairs of trajectories from the agent’s policies.

Learning Behavior Representations Through Multi-Timescale Bootstrapping

This work introduces Bootstrap Across Multiple Scales (BAMS), a multi-scale representation learning model for behavior that combines a pooling module aggregating features extracted by encoders with different temporal receptive fields, bootstrapping the representations in each respective space to encourage disentanglement across timescales.

Capturing cross-session neural population variability through self-supervised identification of consistent neuron ensembles

It is shown that self-supervised training of a deep neural network can compensate for inter-session variability, and that a sequential autoencoding model can maintain state-of-the-art behavior decoding performance for completely unseen recording sessions several days into the future.
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par or better than the current state of the art on both transfer and semi-supervised benchmarks.
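A key ingredient of BYOL is a target network whose weights are an exponential moving average (EMA) of the online network's weights. The update can be sketched with plain NumPy arrays standing in for network parameters (`ema_update` and `tau` are illustrative names; `tau` plays the role of the momentum coefficient):

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.99):
    """BYOL-style target update: each target parameter becomes a convex
    combination of its previous value and the corresponding online
    parameter, weighted by the momentum coefficient tau."""
    return [tau * t + (1 - tau) * o
            for t, o in zip(target_params, online_params)]

# Toy "networks": lists of weight arrays.
online = [np.ones((2, 2)), np.zeros(3)]
target = [np.zeros((2, 2)), np.ones(3)]
target = ema_update(target, online, tau=0.9)  # target drifts toward online
```

In training, this update runs after every optimizer step on the online network, so the target provides a slowly moving regression objective.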

Whitening for Self-Supervised Representation Learning

This paper proposes a different direction and a new loss function for self-supervised learning based on whitening the latent-space features, and empirically shows that this loss accelerates self-supervised training and that the learned representations are much more effective for downstream tasks than in previously published work.
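The whitening operation underlying such a loss can be sketched with NumPy — here a ZCA-style whitening chosen for illustration (the paper's exact transform may differ, and `eps` is an assumed stabilizer):

```python
import numpy as np

def whiten(z, eps=1e-5):
    """ZCA-whiten a batch of latent features so that their empirical
    covariance is (approximately) the identity matrix."""
    z = z - z.mean(0)                          # center each feature
    cov = (z.T @ z) / (len(z) - 1)             # empirical covariance
    vals, vecs = np.linalg.eigh(cov)           # eigendecomposition (symmetric)
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return z @ w                               # decorrelated, unit-variance features

rng = np.random.default_rng(0)
z = rng.normal(size=(64, 5)) * np.array([1.0, 2.0, 3.0, 4.0, 5.0])
zw = whiten(z)
```

Because the whitened features already have identity covariance, a simple similarity loss between views cannot collapse all representations to a single point.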

What Makes for Good Views for Contrastive Learning?

This paper uses empirical analysis to better understand the importance of view selection, arguing that the mutual information (MI) between views should be reduced while keeping task-relevant information intact, and devises unsupervised and semi-supervised frameworks that learn effective views by aiming to reduce their MI.

BYOL works even without batch statistics

Replacing BN with a batch-independent normalization scheme (namely, a combination of group normalization and weight standardization) achieves performance comparable to vanilla BYOL, and disproves the hypothesis that the use of batch statistics is a crucial ingredient for BYOL to learn useful representations.

Self-supervised Label Augmentation via Input Transformations

This paper proposes a novel knowledge transfer technique, referred to as self-distillation, that achieves the effect of aggregated inference in a single (faster) inference pass, and demonstrates large accuracy improvements and the wide applicability of the framework in various fully-supervised settings.

Data-Efficient Reinforcement Learning with Momentum Predictive Representations

This work trains an agent to predict its own latent state representations multiple steps into the future, using a target encoder that is an exponential moving average of the agent's parameters and making predictions with a learned transition model.

Learning by Association — A Versatile Semi-Supervised Training Method for Neural Networks

This work proposes a new framework for semi-supervised training of deep neural networks inspired by learning in humans, demonstrates the capabilities of learning by association on several datasets, and shows that it can substantially improve performance on classification tasks by making use of additionally available unlabeled data.

Local Aggregation for Unsupervised Learning of Visual Embeddings

This work describes a method that trains an embedding function to maximize a metric of local aggregation, causing similar data instances to move together in the embedding space, while allowing dissimilar instances to separate.

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

This work proposes an objective function that naturally avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible.
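The Barlow Twins objective described above can be sketched in NumPy: compute the cross-correlation matrix between batch-normalized embeddings of two distorted views, then penalize its deviation from the identity (`lam` corresponds to the paper's trade-off weight; the small epsilon in the normalization is an assumed stabilizer):

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Invariance term pulls the diagonal of the cross-correlation
    matrix toward 1; the redundancy-reduction term pushes off-diagonal
    entries toward 0. Sketch of the published objective."""
    n = z_a.shape[0]
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)   # batch-normalize view A
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)   # batch-normalize view B
    c = (z_a.T @ z_b) / n                             # cross-correlation matrix
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)         # invariance term
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)  # redundancy reduction
    return on_diag + lam * off_diag

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 8))
loss_same = barlow_twins_loss(z, z)  # identical views: invariance term near zero
```

Making the matrix close to the identity simultaneously enforces invariance between views and decorrelation across embedding dimensions, which avoids collapse without negative pairs.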

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning

Learning a good representation is an essential component of deep reinforcement learning (RL). Representation learning is especially important in multitask and partially observable settings where…