Self-Supervised Feature Learning by Learning to Spot Artifacts

@inproceedings{Jenni2018SelfSupervisedFL,
  title={Self-Supervised Feature Learning by Learning to Spot Artifacts},
  author={S. Jenni and Paolo Favaro},
  booktitle={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  month={June},
  year={2018},
  pages={2733--2742}
}
We introduce a novel self-supervised learning method based on adversarial training. […] To generate images with artifacts, we pre-train a high-capacity autoencoder and then use a damage-and-repair strategy: first, we freeze the autoencoder and damage the output of the encoder by randomly dropping its entries; second, we augment the decoder with a repair network and train it in an adversarial manner against the discriminator.
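The damage step of this strategy can be sketched as follows (a minimal NumPy sketch; the drop probability and the toy feature shape are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def damage_features(z, drop_prob=0.5, rng=rng):
    """Randomly zero out entries of a (frozen) encoder's output.

    In the damage-and-repair strategy, the pre-trained autoencoder is
    frozen and entries of its encoder output are dropped at random; a
    repair network then tries to restore the damaged features before
    decoding. Only the dropping step is sketched here.
    """
    mask = rng.random(z.shape) >= drop_prob  # keep an entry with prob 1 - drop_prob
    return z * mask, mask

z = rng.standard_normal((2, 8))               # toy batch of encoder features
z_damaged, mask = damage_features(z, drop_prob=0.5)
```

Kept entries pass through unchanged while dropped entries become exactly zero, which is what gives the repair network a well-defined inpainting target.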
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
TLDR
This work uses motion cues in the form of optical flow to supervise representations of static images, and achieves state-of-the-art results in self-supervision using motion cues, competitive results for self-supervision in general, and overall state-of-the-art results in self-supervised pretraining for semantic image segmentation.
Self-supervised Learning with Fully Convolutional Networks
TLDR
A novel self-supervised learning framework is developed by formulating the Jigsaw Puzzle problem as a patch-wise classification process and solving it with a fully convolutional network to learn representation from unlabeled data for semantic segmentation.
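The patch-wise classification formulation can be illustrated with a minimal sketch (the grid size, shuffling, and label convention here are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_jigsaw_sample(image, grid=3, rng=rng):
    """Cut an image into a grid of patches, shuffle them, and return
    per-patch position labels.

    A patch-wise formulation of the Jigsaw Puzzle task assigns each
    shuffled patch the index of its original grid cell, so a network
    can be trained to classify every patch's true position.
    """
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    patches = [image[i*ph:(i+1)*ph, j*pw:(j+1)*pw]
               for i in range(grid) for j in range(grid)]
    perm = rng.permutation(grid * grid)
    shuffled = [patches[k] for k in perm]
    labels = perm  # labels[i] = original cell index of shuffled patch i
    return shuffled, labels

img = rng.random((96, 96, 3))
patches, labels = make_jigsaw_sample(img)
```

Each patch then carries its own classification target, which is what lets a fully convolutional network solve the task densely rather than per-image.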
A critical analysis of self-supervision, or what we can learn from a single image
TLDR
It is shown that three different and representative methods, BiGAN, RotNet and DeepCluster, can learn the first few layers of a convolutional network from a single image as well as using millions of images and manual labels, provided that strong data augmentation is used.
Self-Supervised Representation Learning via Neighborhood-Relational Encoding
TLDR
A novel self-supervised representation learning method is proposed that takes advantage of a neighborhood-relational encoding (NRE) among the training data and integrates an encoder-decoder structure to learn to represent samples in light of their local neighborhood information.
Self-Supervised Representation Learning by Rotation Feature Decoupling
  • Zeyu Feng, Chang Xu, D. Tao
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
A self-supervised learning method that focuses on beneficial properties of the representation and its ability to generalize to real-world tasks, and decouples rotation discrimination from instance discrimination, which improves rotation prediction by mitigating the influence of rotation label noise.
CompRess: Self-Supervised Learning by Compressing Representations
TLDR
This work develops a model compression method to compress an already learned, deep self-supervised model (teacher) to a smaller one (student), which outperforms all previous methods including the fully supervised model on ImageNet linear evaluation and on nearest neighbor evaluation.
Unsupervised Pre-Training of Image Features on Non-Curated Data
TLDR
This work proposes a new unsupervised approach which leverages self-supervision and clustering to capture complementary statistics from large-scale data and validates its approach on 96 million images from YFCC100M, achieving state-of-the-art results among unsupervised methods on standard benchmarks.
Learning Object Representations by Mixing Scenes Master Thesis
TLDR
This thesis proposes a novel approach for unsupervised learning of object representations by mixing natural image scenes using adversarial training and demonstrates the potential that lies in learning representations directly from natural image data and reinforces it as a promising avenue for future research.
Leveraging Large-Scale Uncurated Data for Unsupervised Pre-training of Visual Features
TLDR
This work proposes a new unsupervised approach which leverages self-supervision and clustering to capture complementary statistics from large-scale data and validates its approach on 96 million images from YFCC100M, achieving state-of-the-art results among unsupervised methods on standard benchmarks.
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
TLDR
This work extensively studies and validates model performance on over 50 benchmarks including fairness, robustness to distribution shift, geographical diversity, fine-grained recognition, image copy detection and many image classification datasets, and finds that such a model is more robust, more fair, less harmful and less biased than supervised models or models trained on object-centric datasets such as ImageNet.

References

SHOWING 1-10 OF 49 REFERENCES
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Learning from Simulated and Unsupervised Images through Adversarial Training
TLDR
This work develops a method for S+U learning that uses an adversarial network similar to Generative Adversarial Networks (GANs), but with synthetic images as inputs instead of random vectors, and makes several key modifications to the standard GAN algorithm to preserve annotations, avoid artifacts, and stabilize training.
Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks
TLDR
A simple semi-supervised learning approach for images based on in-painting using an adversarial loss is introduced, able to directly train large VGG-style networks in a semi-supervised fashion.
Context Encoders: Feature Learning by Inpainting
TLDR
It is found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures, and can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
Unsupervised Learning by Predicting Noise
TLDR
This paper introduces a generic framework to train deep networks end-to-end with no supervision: it fixes a set of target representations, called Noise As Targets (NAT), and constrains the deep features to align to them.
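The NAT idea can be sketched minimally as follows, assuming fixed unit-norm random targets and substituting a simple greedy match for the optimal-assignment step the paper uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fix a set of target representations once; they stay frozen during training.
n, d = 6, 4
targets = rng.standard_normal((n, d))
targets /= np.linalg.norm(targets, axis=1, keepdims=True)

def assign_targets(features, targets):
    """Match each feature to a distinct fixed target by similarity.

    Greedy matching is a simplification for illustration; the paper
    solves an optimal assignment problem. Returns one target index
    per feature, each target used exactly once.
    """
    sims = features @ targets.T
    assignment = -np.ones(len(features), dtype=int)
    taken = set()
    for i in np.argsort(-sims.max(axis=1)):      # most confident features first
        for j in np.argsort(-sims[i]):           # best available target
            if j not in taken:
                assignment[i] = j
                taken.add(j)
                break
    return assignment

features = rng.standard_normal((n, d))
assignment = assign_targets(features, targets)
```

Training would then minimize the distance between each feature and its assigned target, re-solving the assignment as the features evolve.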
Colorization as a Proxy Task for Visual Understanding
TLDR
This work investigates and improves self-supervision as a drop-in replacement for ImageNet pretraining, focusing on automatic colorization as the proxy task, and presents the first in-depth analysis of self-supervision via colorization, concluding that formulation of the loss, training details and network architecture play important roles in its effectiveness.
Adversarial Autoencoders
TLDR
This paper shows how the adversarial autoencoder can be used in applications such as semi-supervised classification, disentangling style and content of images, unsupervised clustering, dimensionality reduction and data visualization, and performed experiments on MNIST, Street View House Numbers and Toronto Face datasets.
Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification
TLDR
This paper formulates an approach for learning a visual representation from the raw spatiotemporal signals in videos using a Convolutional Neural Network, and shows that this method captures information that is temporally varying, such as human pose.
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
TLDR
This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.
Data-dependent Initializations of Convolutional Neural Networks
TLDR
This work presents a fast and simple data-dependent initialization procedure, that sets the weights of a network such that all units in the network train at roughly the same rate, avoiding vanishing or exploding gradients.