Corpus ID: 8317437

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, Jianxiong Xiao
While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry. Large labeled training datasets, expensive and tedious to produce, are required to optimize millions of parameters in deep network models. Lagging behind the growth in model capacity, the available datasets are quickly becoming outdated in terms of size and density. To circumvent this bottleneck, we propose to amplify human effort… 

Citations

Image Generation From Small Datasets via Batch Statistics Adaptation
This work proposes a new method for transferring the prior knowledge of a pre-trained generator, trained on a large dataset, to a small dataset in a different domain, and can generate higher-quality images than previous methods without collapsing.
Learning High-Resolution Domain-Specific Representations with a GAN Generator
This work considers the semi-supervised learning scenario in which a small amount of labeled data is available along with a large unlabeled dataset from the same domain, and finds that using a LayerMatch-pretrained backbone leads to superior accuracy compared to standard supervised pretraining on ImageNet.
PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs
An architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion, enabling context-aware local image editing with accurate, pixel-level localized control.
Opening Deep Neural Networks With Generative Models
A thorough evaluation of the proposed GeMOS method against state-of-the-art open set algorithms finds that GeMOS either outperforms or is statistically indistinguishable from more complex and costly models.
Improved Techniques for Training Single-Image GANs
This work conducts a number of experiments to understand the challenges of training generative models from a single image, and proposes best practices that yield improved results over previous work.
Ensembling Off-the-shelf Models for GAN Training
An effective selection mechanism is proposed: by probing the linear separability between real and fake samples in pretrained model embeddings, the most accurate model is chosen and progressively added to the discriminator ensemble, improving GAN training in both limited-data and large-scale settings.
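As a toy illustration of that selection criterion, one can fit a simple linear probe on pretrained embeddings and score held-out accuracy. This is a hypothetical sketch: the least-squares probe and the function name are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def linear_probe_accuracy(real_emb, fake_emb, rng=None):
    """Score how linearly separable real vs. fake embeddings are.

    Fits a least-squares linear classifier on half the data and returns
    accuracy on the held-out half; higher accuracy suggests the pretrained
    model's features separate real from generated samples well.
    (Illustrative sketch only, not the paper's procedure.)
    """
    if rng is None:
        rng = np.random.default_rng()
    X = np.vstack([real_emb, fake_emb])
    y = np.concatenate([np.ones(len(real_emb)), -np.ones(len(fake_emb))])
    idx = rng.permutation(len(X))
    train, test = idx[: len(X) // 2], idx[len(X) // 2:]
    # Append a bias column and solve the least-squares problem.
    Xtr = np.hstack([X[train], np.ones((len(train), 1))])
    w, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
    Xte = np.hstack([X[test], np.ones((len(test), 1))])
    return float(np.mean(np.sign(Xte @ w) == y[test]))
```

On well-separated embedding clusters this probe scores near 1.0; on heavily overlapping ones it stays near chance, which is what makes it usable as a ranking signal.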
CNN-Generated Images Are Surprisingly Easy to Spot… for Now
It is demonstrated that, with careful pre- and post-processing and data augmentation, a standard image classifier trained on only one specific CNN generator (ProGAN) is able to generalize surprisingly well to unseen architectures, datasets, and training methods.
Super-Resolution via Deep Learning
Efficient Feature Transformations for Discriminative and Generative Continual Learning
This work proposes a simple task-specific feature map transformation strategy for continual learning, which it calls Efficient Feature Transformations (EFTs), which provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features
Patches are cut and pasted among training images, with the ground-truth labels mixed proportionally to the area of the patches; CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification, as well as on the ImageNet weakly-supervised localization task.
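The cut-and-paste mixing described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a Beta(1, 1) mixing prior and a hypothetical function name; it is not the authors' implementation.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, rng=None):
    """Paste a random patch of img_b into img_a and mix the labels.

    img_a, img_b: (H, W, C) arrays; label_a, label_b: one-hot vectors.
    The mixed label weights each source by its share of the image area.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = img_a.shape[:2]
    lam = rng.beta(1.0, 1.0)                    # target mixing ratio
    ph, pw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(h), rng.integers(w)   # random patch center
    r0, r1 = max(cy - ph // 2, 0), min(cy + ph // 2, h)
    c0, c1 = max(cx - pw // 2, 0), min(cx + pw // 2, w)

    mixed = img_a.copy()
    mixed[r0:r1, c0:c1] = img_b[r0:r1, c0:c1]   # paste the patch
    # Recompute lambda from the actual pasted area, since clipping at the
    # borders can shrink the patch relative to the sampled ratio.
    lam_eff = 1.0 - (r1 - r0) * (c1 - c0) / (h * w)
    return mixed, lam_eff * label_a + (1.0 - lam_eff) * label_b
```

Mixing the labels by pasted area is the key point: the classifier is supervised in proportion to how much of each source image actually remains visible.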

References
Very Deep Convolutional Networks for Large-Scale Image Recognition
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images
This work takes convolutional neural networks trained to perform well on either ImageNet or MNIST and uses evolutionary algorithms or gradient ascent to find images that the DNNs label with high confidence as belonging to each dataset class; the resulting fooling images raise questions about the generality of DNN computer vision.
Towards Scalable Dataset Construction: An Active Learning Approach
This work presents a discriminative learning process that employs active, online learning to quickly classify many images with minimal user input, and demonstrates precision that is often superior to the state of the art, with scalability that exceeds previous work.
Multi-Level Active Prediction of Useful Image Annotations for Recognition
This work proposes to allow the category-learner to strategically choose which annotations it receives, based on both the expected reduction in uncertainty and the relative costs of obtaining each annotation, in order to learn more accurate category models with a lower total expenditure of manual annotation effort.
ImageNet classification with deep convolutional neural networks
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
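At its core, the batch normalization this entry refers to standardizes each feature across the mini-batch and then applies a learned affine transform. A minimal training-mode sketch follows; at inference time the batch statistics would be replaced by running estimates.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over a (N, D) activation matrix.

    Each of the D features is normalized to zero mean and unit variance
    across the batch of N examples, then scaled by gamma and shifted by
    beta (both learned, one value per feature).
    """
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # eps guards against zero variance
    return gamma * x_hat + beta
```

Keeping activation distributions stable across layers in this way is what lets the paper train with much higher learning rates and far fewer steps.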
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Unbiased look at dataset bias
A comparison study of a set of popular datasets is presented, evaluated on a number of criteria including relative data bias, cross-dataset generalization, effects of the closed-world assumption, and sample value.
Multiclass recognition and part localization with humans in the loop
A visual recognition system designed for fine-grained visual categorization that, by leveraging computer vision and analyzing user responses, achieves a significant average reduction in human effort over previous methods.