Corpus ID: 162168848

Data-Efficient Image Recognition with Contrastive Predictive Coding

@article{Hnaff2020DataEfficientIR,
  title={Data-Efficient Image Recognition with Contrastive Predictive Coding},
  author={Olivier J. H{\'e}naff and A. Srinivas and Jeffrey De Fauw and Ali Razavi and Carl Doersch and S. M. Ali Eslami and A{\"a}ron van den Oord},
  journal={ArXiv},
  year={2020},
  volume={abs/1905.09272}
}
Large-scale deep learning excels when labeled images are abundant, yet data-efficient learning remains a longstanding challenge. [...] Key Result: We expect these results to open the door to pipelines that use scalable unsupervised representations as a drop-in replacement for supervised ones for real-world vision tasks where labels are scarce.
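At the core of Contrastive Predictive Coding is an InfoNCE-style objective: a predicted embedding is scored against the true target and a set of negatives, and the loss is the cross-entropy of picking out the positive. A minimal NumPy sketch under illustrative names and shapes (not the paper's actual code):

```python
import numpy as np

def info_nce_loss(pred, candidates, pos_idx=0):
    """InfoNCE: negative log-softmax probability of the positive candidate.

    pred:       (d,) predicted embedding (e.g. output of a context network)
    candidates: (n, d) rows are candidate embeddings; row `pos_idx` is the
                true target, the remaining rows are negatives.
    """
    logits = candidates @ pred                      # similarity scores, shape (n,)
    logits = logits - logits.max()                  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[pos_idx]

# toy check: the loss is smaller when the positive aligns with the prediction
pred = np.array([1.0, 0.0])
cands = np.stack([np.array([1.0, 0.0]),    # positive, aligned with pred
                  np.array([-1.0, 0.0]),   # negative
                  np.array([0.0, 1.0])])   # negative
loss_good = info_nce_loss(pred, cands)
```

Minimizing this loss pushes the prediction toward its true target and away from negatives drawn from other image patches.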
Efficient Visual Pretraining with Contrastive Detection
TLDR: This work introduces a new self-supervised objective, contrastive detection, which tasks representations with identifying object-level features across augmentations, leading to state-of-the-art transfer performance from ImageNet to COCO while requiring up to 5× less pretraining.
Generative Pretraining From Pixels
TLDR: This work trains a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure, and finds that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification.
Combining PENCIL with AMDIM for image classification with noisy and sparsely labeled data
TLDR: This paper evaluates the effectiveness of combining PENCIL, a framework for correcting noisy labels during training, with AMDIM, a self-supervised technique for learning good data representations from unlabeled data, and finds that the combination is significantly more effective on sparse and noisy labels than either approach alone.
Self-supervised Visual Feature Learning and Classification Framework: Based on Contrastive Learning
TLDR: A Self-supervised Visual Feature Learning and Classification framework that can be applied to large-scale training data without annotation; it outperforms commonly used approaches and serves as a baseline for future improvements to the unsupervised learning paradigm.
When Does Contrastive Visual Representation Learning Work?
Recent self-supervised representation learning techniques have largely closed the gap between supervised and unsupervised learning on ImageNet classification. [...]
Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases
TLDR: This work demonstrates that approaches like MoCo and PIRL learn occlusion-invariant representations but fail to capture viewpoint and category-instance invariance, which are crucial components of object recognition, and proposes an approach that leverages unstructured videos to learn representations with higher viewpoint invariance.
Momentum Contrast for Unsupervised Visual Representation Learning
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. [...]
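MoCo's dictionary rests on two mechanisms: a FIFO queue of encoded keys (old keys are evicted as new batches arrive) and a momentum (exponential-moving-average) update that lets the key encoder trail the query encoder. A small sketch of both, with illustrative names and weights reduced to plain arrays rather than the paper's implementation:

```python
from collections import deque
import numpy as np

class MoCoDictionary:
    """Sketch of MoCo's queue + momentum-updated key encoder."""

    def __init__(self, q_weights, k_weights, queue_size=4096, m=0.999):
        self.q_weights = np.asarray(q_weights, dtype=float)   # query encoder params
        self.k_weights = np.asarray(k_weights, dtype=float)   # key encoder params
        self.m = m
        self.queue = deque(maxlen=queue_size)  # oldest keys fall off automatically

    def momentum_update(self):
        # key encoder slowly tracks the query encoder: k <- m*k + (1-m)*q
        self.k_weights = self.m * self.k_weights + (1 - self.m) * self.q_weights

    def enqueue(self, keys):
        # newly encoded keys become negatives for future batches
        for k in keys:
            self.queue.append(k)
```

The large momentum (m close to 1) keeps the keys in the queue consistent with one another, which the paper argues is essential for a stable contrastive target.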
PatchFormer: A Neural Architecture for Self-Supervised Representation Learning on Images
2019
Learning rich representations from predictive learning without labels has been a longstanding challenge in the field of machine learning. [...]
Rethinking Image Mixture for Unsupervised Visual Representation Learning
TLDR: Despite its conceptual simplicity, it is shown empirically that the simple solution of image mixture yields more robust visual representations from the transformed input, and the benefits of representations learned from this space carry over to linear classification and downstream tasks.
A Framework using Contrastive Learning for Classification with Noisy Labels
TLDR: An extensive empirical study showing that a preliminary contrastive learning step brings a significant gain in performance when using different loss functions: non-robust, robust, and early-learning regularized.

References

Showing 1–10 of 118 references
Unsupervised Representation Learning by Predicting Image Rotations
TLDR: This work proposes to learn image features by training ConvNets to recognize the 2D rotation applied to the input image, and demonstrates both qualitatively and quantitatively that this apparently simple task provides a very powerful supervisory signal for semantic feature learning.
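The rotation pretext task needs no labels: each image is rotated by 0°, 90°, 180°, and 270°, and the network is trained to classify which rotation was applied. A minimal sketch of the batch construction (function name and shapes are illustrative; square images assumed so the rotations stack):

```python
import numpy as np

def make_rotation_batch(images):
    """Turn unlabeled images into a 4-way rotation-classification batch.

    Each image contributes four training examples: the image rotated by
    k quarter-turns (k = 0..3), labeled with the rotation index k.
    """
    xs, ys = [], []
    for img in images:
        for k in range(4):
            xs.append(np.rot90(img, k))   # rotate by k * 90 degrees
            ys.append(k)                  # label = rotation class
    return np.stack(xs), np.array(ys)
```

A ConvNet trained on these (input, label) pairs must attend to object orientation and semantics to succeed, which is the source of the supervisory signal.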
Discriminative Unsupervised Feature Learning with Convolutional Neural Networks
TLDR: This paper presents an approach for training a convolutional neural network using only unlabeled data, training the network to discriminate between a set of surrogate classes, and finds that this simple feature learning algorithm is surprisingly successful when applied to visual object recognition.
Unsupervised Visual Representation Learning by Context Prediction
TLDR: It is demonstrated that the feature representation learned using this within-image context indeed captures visual similarity across images and allows unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset.
Momentum Contrast for Unsupervised Visual Representation Learning
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. [...]
Unsupervised Feature Learning via Non-parametric Instance Discrimination
TLDR: This work formulates this intuition as a non-parametric classification problem at the instance level and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.
Representation Learning with Contrastive Predictive Coding
TLDR: This work proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data, which it calls Contrastive Predictive Coding, and demonstrates that the approach learns useful representations achieving strong performance on four distinct domains: speech, images, text, and reinforcement learning in 3D environments.
Learning Features by Watching Objects Move
TLDR: Inspired by the human visual system, low-level motion-based grouping cues can be used to learn an effective visual representation that significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce.
Learning Image Representations Tied to Ego-Motion
TLDR: This work proposes to exploit proprioceptive motor signals to provide unsupervised regularization in convolutional neural networks that learn visual representations from egocentric video, enforcing that the learned features exhibit equivariance, i.e., respond predictably to transformations associated with distinct ego-motions.
Deep Residual Learning for Image Recognition
TLDR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence that these residual networks are easier to optimize and can gain accuracy from considerably increased depth.
ImageNet Large Scale Visual Recognition Challenge
TLDR: The creation of this benchmark dataset and the advances in object recognition it has enabled are described, and state-of-the-art computer vision accuracy is compared with human accuracy.