• Corpus ID: 4009713

Unsupervised Representation Learning by Predicting Image Rotations

@article{Gidaris2018UnsupervisedRL,
  title={Unsupervised Representation Learning by Predicting Image Rotations},
  author={Spyros Gidaris and Praveer Singh and Nikos Komodakis},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.07728}
}
Over the last years, deep convolutional neural networks (ConvNets) have transformed the field of computer vision thanks to their unparalleled capacity to learn high-level semantic image features. [...] Key result: we exhaustively evaluate our method on various unsupervised feature learning benchmarks and exhibit state-of-the-art performance in all of them. Specifically, our results on those benchmarks demonstrate dramatic improvements w.r.t. prior state-of-the-art approaches in unsupervised representation learning.
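The pretext task named in the title, predicting which rotation has been applied to an image, can be sketched in a few lines. The snippet below is a minimal illustration rather than code from the paper: it assumes a PyTorch-style setup, and toy_convnet is a hypothetical stand-in for whatever ConvNet backbone is actually pre-trained.

import torch
import torch.nn as nn

def rotate_batch(images):
    # Build the 4-way rotation pretext task: every image is rotated by
    # 0, 90, 180 and 270 degrees, and the rotation index is the label.
    rotated, labels = [], []
    for k in range(4):  # k quarter-turns
        rotated.append(torch.rot90(images, k, dims=[2, 3]))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

# Hypothetical tiny backbone with a 4-way output head; any ConvNet
# classifier would play the same role in this sketch.
toy_convnet = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4),
)

images = torch.randn(8, 3, 32, 32)        # a batch of unlabeled images
inputs, targets = rotate_batch(images)    # 32 inputs, labels in {0, 1, 2, 3}
loss = nn.CrossEntropyLoss()(toy_convnet(inputs), targets)
loss.backward()                           # the only learning signal is the rotation label

The features are learned purely from this rotation-classification objective; no human annotation is involved, which is the sense in which the representation learning is unsupervised.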
AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation
TLDR
This work proposes AugNet, a new deep learning training paradigm that learns image features from a collection of unlabeled pictures, and develops a method that expresses the similarity between pictures as distance metrics in the embedding space by leveraging the inter-correlation between augmented versions of the samples.
Surprising Effectiveness of Few-Image Unsupervised Feature Learning
TLDR
This paper analyzes three different self-supervised feature learning methods as a function of the number of training images and shows that, with just a single unlabelled training image, the accuracy of the first two convolutional layers of common networks can be matched or exceeded, with competitive results for the other layers.
Image Representation Learning by Transformation Regression
TLDR
A regression model is designed to predict the continuous parameters of a group of transformations, i.e., image rotation, translation, and scaling; it is found that, with the proposed training mechanism used as an initialization, the performance of existing state-of-the-art deep classification architectures can be further improved.
Unsupervised learning of discriminative representation for image recognition
  • 2019
Like machine learning in general, computer vision has witnessed a profound change with the recent re-popularization of Deep Neural Networks (DNNs) at the end of 2012 [6]. For the first time in several years, DNN…
Unsupervised Learning of Dense Visual Representations
TLDR
View-Agnostic Dense Representation (VADeR) is proposed for unsupervised learning of dense, pixelwise representations by forcing local features to remain constant over different viewing conditions through pixel-level contrastive learning.
Leveraging Large-Scale Uncurated Data for Unsupervised Pre-training of Visual Features
TLDR
This work proposes a new unsupervised approach which leverages self-supervision and clustering to capture complementary statistics from large-scale data and validates its approach on 96 million images from YFCC100M, achieving state-of-the-art results among unsupervised methods on standard benchmarks.
Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning
TLDR
The proposed Relative Orders for Unsupervised Learning (ROUL) method is able to significantly improve the performance of unsupervised deep metric learning.
Self-supervised Learning with Fully Convolutional Networks
TLDR
A novel self-supervised learning framework is developed by formulating the Jigsaw Puzzle problem as a patch-wise classification process and solving it with a fully convolutional network to learn representations from unlabeled data for semantic segmentation.
Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction.
TLDR
With the self-supervised 3DRotNet pre-trained on large datasets, recognition accuracy is boosted by 20.4% on UCF101 and 16.7% on HMDB51, respectively, compared to models trained from scratch.
Laplacian Denoising Autoencoder
TLDR
This paper proposes to learn data representations with a novel type of denoising autoencoder, where the noisy input data is generated by corrupting latent clean data in the gradient domain, which can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data.

References

Showing 1-10 of 40 references
Unsupervised Learning by Predicting Noise
TLDR
This paper introduces a generic framework to train deep networks, end-to-end, with no supervision, by fixing a set of target representations, called Noise As Targets (NAT), and constraining the deep features to align to them.
Unsupervised Visual Representation Learning by Context Prediction
TLDR
It is demonstrated that the feature representation learned using this within-image context indeed captures visual similarity across images and allows us to perform unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset.
Context Encoders: Feature Learning by Inpainting
TLDR
It is found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures, and can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
TLDR
A novel unsupervised learning approach is introduced to build features suitable for object detection and classification; to facilitate the transfer of features to other tasks, the context-free network (CFN), a siamese-ennead convolutional neural network, is proposed.
Data-dependent Initializations of Convolutional Neural Networks
TLDR
This work presents a fast and simple data-dependent initialization procedure that sets the weights of a network such that all units in the network train at roughly the same rate, avoiding vanishing or exploding gradients.
Joint Unsupervised Learning of Deep Representations and Image Clusters
TLDR
A recurrent framework for joint unsupervised learning of deep representations and image clusters is proposed; by integrating the two processes into a single model with a unified weighted triplet loss function and optimizing it end-to-end, it obtains not only more powerful representations but also more precise image clusters.
Representation Learning by Learning to Count
TLDR
This paper uses two image transformations, scaling and tiling, in the context of counting to train a neural network with a contrastive loss, producing representations that perform on par with or exceed the state of the art on transfer learning benchmarks.
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Learning Features by Watching Objects Move
TLDR
Inspired by the human visual system, low-level motion-based grouping cues can be used to learn an effective visual representation that significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce.
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
TLDR
This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.