• Corpus ID: 247519082

Bag of Instances Aggregation Boosts Self-supervised Distillation

Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
Self-supervised learning has made remarkable progress recently, especially contrastive learning based methods, which regard each image and its augmentations as an individual class and try to distinguish them from all other images. However, due to the large quantity of exemplars, this kind of pretext task intrinsically suffers from slow convergence and is hard to optimize. This is especially true for small-scale models, for which we find the performance drops…
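The instance-discrimination objective described in this abstract is commonly written as an InfoNCE-style loss. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and the use of plain Python lists as embeddings are all hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(query, positive, negatives, temperature=0.2):
    """Cross-entropy of picking the positive key among positive + negatives.

    `temperature` is an assumed illustrative value, not taken from the paper.
    """
    q = normalize(query)
    keys = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [dot(q, k) / temperature for k in keys]
    # Numerically stable log-sum-exp over all candidate keys.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # The positive key sits at index 0.
    return log_sum - logits[0]
```

The loss is small when the query is close to its own augmentation and far from every other image, and it grows with the number of hard negatives, which is one way to see why the task becomes harder as the number of exemplars increases.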

Relational Self-Supervised Learning

This paper introduces a novel SSL paradigm, termed the relational self-supervised learning (ReSSL) framework, which learns representations by modeling the relationship between different instances, employing a sharpened distribution of pairwise similarities among instances as the relation metric.
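One way to read "sharpened distribution of pairwise similarities" is a softmax over similarity scores with a low temperature, used as the target for a second, softer distribution. The sketch below assumes that reading; the function names and both temperature values are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores, temperature):
    # A lower temperature yields a sharper (more peaked) distribution.
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def relation_loss(target_sims, online_sims, t_target=0.04, t_online=0.1):
    """Cross-entropy between a sharpened target relation and an online relation.

    Each argument holds the similarities of one instance to a shared set of
    other instances. Temperatures are assumed illustrative values.
    """
    p = softmax(target_sims, t_target)   # sharpened target distribution
    q = softmax(online_sims, t_online)   # distribution being trained
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is low when both views rank the other instances in the same order, so the model is trained to agree on relations between instances rather than only on matching a single positive pair.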

Slimmable Networks for Contrastive Self-supervised Learning

Self-supervised learning makes great progress in large model pre-training but suffers in training small models. Previous solutions to this problem mainly rely on knowledge distillation and indeed

Dual Contrastive Learning for Spatio-temporal Representation

Extensive experiments demonstrate that DCLR learns effective spatio-temporal representations and obtains state-of-the-art or comparable performance on UCF-101, HMDB-51, and Diving-48



DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning

Experimental results demonstrate that DisCo surpasses the state of the art on all lightweight models by a large margin; the paper also proposes enlarging the embedding dimension to alleviate the Distilling BottleNeck problem.

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

This work addresses the task of semantic image segmentation with deep learning. It proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
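Atrous (dilated) convolution spaces the kernel taps `rate` samples apart, which enlarges the receptive field without adding weights. The following is a minimal 1-D sketch with no padding; the function name and signature are hypothetical and do not come from the DeepLab code.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid 1-D atrous convolution (cross-correlation form, as in deep learning).

    Kernel taps are placed `rate` samples apart; rate=1 is ordinary convolution.
    """
    span = (len(kernel) - 1) * rate  # distance from the first tap to the last
    out = []
    for start in range(len(signal) - span):
        out.append(sum(w * signal[start + k * rate] for k, w in enumerate(kernel)))
    return out
```

Applying the same kernel at several rates in parallel, for example `atrous_conv1d(x, k, 1)` and `atrous_conv1d(x, k, 2)`, probes the input at several scales, which is the idea behind the pyramid in ASPP.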

ImageNet: A large-scale hierarchical image database

A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

SEED: Self-supervised Distillation For Visual Representation

This paper proposes a new learning paradigm, named SElf-SupErvised Distillation (SEED), where a larger network is leveraged to transfer its representational knowledge into a smaller architecture in a self-supervised fashion, and shows that SEED dramatically boosts the performance of small networks on downstream tasks.

CompRess: Self-Supervised Learning by Compressing Representations

This work develops a model compression method to compress an already learned, deep self-supervised model (teacher) to a smaller one (student), which outperforms all previous methods including the fully supervised model on ImageNet linear evaluation and on nearest neighbor evaluation.

Big Self-Supervised Models are Strong Semi-Supervised Learners

The proposed semi-supervised learning algorithm can be summarized in three steps: unsupervised pretraining of a big ResNet model using SimCLRv2 (a modification of SimCLR), supervised fine-tuning on a few labeled examples, and distillation with unlabeled examples for refining and transferring the task-specific knowledge.

Improved Baselines with Momentum Contrastive Learning

With simple modifications to MoCo, this note establishes stronger baselines that outperform SimCLR and do not require large training batches, and hopes this will make state-of-the-art unsupervised learning research more accessible.

A Simple Framework for Contrastive Learning of Visual Representations

This work shows that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
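The core of residual learning is that a block outputs its input plus a learned correction, y = x + F(x), so the layers only need to model the residual. The sketch below illustrates the skip connection on plain Python lists; `residual_block` and `residual_fn` are hypothetical names, and real blocks use convolutional layers for F.

```python
def residual_block(x, residual_fn):
    """Return x + F(x), where F is any function that preserves the input length."""
    fx = residual_fn(x)
    # Identity shortcut: the input is added back element-wise.
    return [xi + fi for xi, fi in zip(x, fx)]
```

If `residual_fn` returns all zeros, the block reduces to the identity mapping, which is one intuition for why stacking many such blocks does not make optimization harder.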

Mask R-CNN

This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.