Corpus ID: 232135355

VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges

@article{Bruintjes2021VIPriors1,
  title={VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges},
  author={Robert-Jan Bruintjes and Attila Lengyel and Marcos Baptista R{\'i}os and Osman Semih Kayhan and Jan C. van Gemert},
}
We present the first edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" challenges. We offer four data-impaired challenges, in which models are trained from scratch and the number of training samples is reduced to a fraction of the full set. Furthermore, to encourage data-efficient solutions, we prohibited the use of pre-trained models and other transfer learning techniques. The majority of top-ranking solutions make heavy use of data augmentation, model ensembling…
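The data-impaired setting described in the abstract (a fixed small number of training samples per class, no pre-trained weights) can be sketched with a simple class-balanced subsampling routine. This is an illustrative sketch, not code from the challenge toolkit; the function name and sample counts are assumptions.

```python
import random
from collections import defaultdict

def subsample_per_class(samples, labels, n_per_class, seed=0):
    """Keep at most n_per_class samples for each label (class-balanced subset)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    subset = []
    for label, items in by_class.items():
        rng.shuffle(items)  # random subset, reproducible via the seed
        for item in items[:n_per_class]:
            subset.append((item, label))
    return subset

# Toy example: reduce a 10-sample, 2-class dataset to 2 samples per class.
small = subsample_per_class(list(range(10)), [i % 2 for i in range(10)], n_per_class=2)
```

The challenge datasets were built in this per-class spirit (e.g. a sub-sampled ImageNet with a fixed number of images per class), which keeps the class distribution balanced while shrinking the training set.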


A Strong Baseline for the VIPriors Data-Efficient Image Classification Challenge

This work presents a strong baseline for data-efficient image classification on the VIPriors challenge dataset, which is a sub-sampled version of ImageNet-1k with 100 images per class, and achieves 69.7% accuracy.

2nd Place Solution for ICCV 2021 VIPriors Image Classification Challenge: An Attract-and-Repulse Learning Approach

A new framework, termed Attract-and-Repulse, is proposed, consisting of Contrastive Regularization to enrich the feature representations, Symmetric Cross Entropy to balance the fitting for different classes, and Mean Teacher to calibrate label information; it achieved second place in the ICCV 2021 VIPriors Image Classification Challenge.
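The Symmetric Cross Entropy term mentioned above adds a reverse cross-entropy to the usual one, which is known to temper over-fitting to hard or noisy labels. A minimal numpy sketch follows; the hyper-parameter values are illustrative defaults from the original SCE formulation, not values reported by this solution.

```python
import numpy as np

def symmetric_cross_entropy(probs, one_hot, alpha=0.1, beta=1.0, log_zero=-4.0):
    """SCE = alpha * CE(y, p) + beta * reverse CE(p, y).

    log(0) in the reverse term is undefined, so it is clipped to a
    constant (here -4), following the usual SCE trick.
    """
    probs = np.clip(probs, 1e-7, 1.0)
    ce = -np.sum(one_hot * np.log(probs), axis=-1)          # standard term
    log_targets = np.where(one_hot > 0, 0.0, log_zero)      # log(1)=0, log(0)->-4
    rce = -np.sum(probs * log_targets, axis=-1)             # reverse term
    return alpha * ce + beta * rce
```

The reverse term penalizes probability mass placed on non-target classes, while the small `alpha` keeps the ordinary cross-entropy from dominating.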

Tune It or Don’t Use It: Benchmarking Data-Efficient Image Classification

Surprisingly, it is found that tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline, which outperforms all but one specialized method and performs competitively with the remaining one.
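The tuning procedure that summary describes amounts to a plain grid search over learning rate, weight decay, and batch size scored on a held-out split. A minimal sketch, assuming a user-supplied `train_eval` function that trains a model with the given configuration and returns validation accuracy (the function names are hypothetical):

```python
from itertools import product

def grid_search(train_eval, lrs, wds, batch_sizes):
    """Return the (lr, weight_decay, batch_size) triple that maximizes
    the validation accuracy reported by train_eval."""
    return max(product(lrs, wds, batch_sizes),
               key=lambda cfg: train_eval(*cfg))

# Toy example with a fake objective standing in for training + validation.
best = grid_search(lambda lr, wd, bs: -abs(lr - 0.1) - abs(wd - 1e-4),
                   lrs=[0.01, 0.1], wds=[1e-4, 1e-2], batch_sizes=[32, 64])
```

The paper's point is that this unglamorous baseline, tuned honestly on a validation split, is already hard to beat in the small-data regime.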

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data

State-of-the-art experimental results on real-world benchmarks (VIPriors, ImageNet100 and NICO) validate the great potential of equivariance and invariance in data-efficient learning.

DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations

DeepSportradar-v1, a suite of computer vision tasks, datasets and benchmarks for automated sport understanding, is introduced to close the gap between academic research and real-world settings.

TCLR: Temporal Contrastive Learning for Video Representation (Computer Vision and Image Understanding)

A new temporal contrastive learning framework consisting of two novel losses that improves upon existing contrastive self-supervised video representation learning methods, with significant gains over state-of-the-art results in various downstream video understanding tasks, such as action recognition, limited-label action classification, and nearest-neighbor video retrieval, across multiple video datasets and backbones.

Genetic Programming-Based Evolutionary Deep Learning for Data-Efficient Image Classification

A genetic programming-based evolutionary deep learning approach that automatically evolves variable-length models using important operators from both the image and classification domains, achieving better performance than deep learning methods in most cases.

Image Classification with Small Datasets: Overview and Benchmark

This article systematically organizes and connects past studies to consolidate a community that is currently fragmented and scattered, and proposes a common benchmark that allows for an objective comparison of approaches.

WikiChurches: A Fine-Grained Dataset of Architectural Styles with Real-World Challenges

A novel dataset for architectural style classification, consisting of 9,485 images of church buildings, and provides 631 bounding box annotations of characteristic visual features for 139 churches from four major categories, to serve as a benchmark for various research fields.
Distilling Visual Priors from Self-Supervised Learning

A novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting is presented.

1st Visual Inductive Priors for Data-Efficient Deep Learning workshop at ECCV 2020: semantic segmentation Challenge Track Technical Report: Multi-level tail pixel cutmix and scale attention for long-tailed scene parsing

A new multi-level tail-pixel CutMix that cuts regions according to the tail pixel distribution to improve long-tailed scene parsing, combined with a scale-attention model and multi-scale augmentation to obtain a more effective representation of the long-tailed pixel distribution.
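The report above builds on CutMix, which pastes a rectangular region of one training image into another and mixes the labels in proportion to the pasted area. The sketch below shows plain CutMix only, not the multi-level tail-pixel variant; the function name and fixed RNG seed are illustrative.

```python
import numpy as np

def cutmix(img_a, img_b, lam, rng=None):
    """Paste a random rectangle from img_b into img_a.

    The rectangle covers a (1 - lam) fraction of the image area, so the
    labels mix as lam * label_a + (1 - lam) * label_b.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = img_a.shape[:2]
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy = rng.integers(0, h - cut_h + 1)  # top-left corner of the patch
    cx = rng.integers(0, w - cut_w + 1)
    mixed = img_a.copy()
    mixed[cy:cy + cut_h, cx:cx + cut_w] = img_b[cy:cy + cut_h, cx:cx + cut_w]
    return mixed
```

The tail-pixel variant in the report biases where regions are cut toward rare classes, rather than sampling the rectangle uniformly as here.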

A Technical Report for VIPriors Image Classification Challenge

In this challenge, the difficulty lies in training the model from scratch without any pretrained weights, so several strong backbones and multiple loss functions are used to learn more representative features.

Data-Efficient Deep Learning Method for Image Classification Using Data Augmentation, Focal Cosine Loss, and Ensemble

This work applies techniques in three aspects (data, loss function, and prediction) to enable training from scratch with less data, achieving high accuracy on an ImageNet subset that consists of only 50 images per class.

Data-efficient semantic segmentation via extremely perturbed data augmentation

A new variation of the strong augmentation CutMix, called Progressive Sprinkles, is presented, with improved results over the authors' baseline; the work also investigates how to tune the hyper-parameters of these advanced augmentations for scene understanding.

A Visual Inductive Priors Framework for Data-Efficient Image Classification

This work proposes a novel neural network architecture, DSK-net, which is very effective when training on small datasets, and won first place in the VIPriors image classification competition.

ResNeSt: Split-Attention Networks

This work designs a new variant of the ResNet model, named ResNeSt, which applies channel-wise attention across different network branches to leverage the complementary strengths of feature-map attention and multi-path representation, and outperforms EfficientNet in terms of the accuracy/latency trade-off.

Improved Baselines with Momentum Contrastive Learning

With simple modifications to MoCo, this note establishes stronger baselines that outperform SimCLR and do not require large training batches, and hopes this will make state-of-the-art unsupervised learning research more accessible.
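The momentum mechanism at the heart of MoCo is that the key encoder trails the query encoder as an exponential moving average, so the dictionary of keys stays consistent without backpropagating through it. A minimal sketch, with flat parameter lists standing in for real network weights:

```python
def momentum_update(key_params, query_params, m=0.999):
    """Update the key encoder toward the query encoder:

        k <- m * k + (1 - m) * q

    Gradients never flow through the key encoder; it only follows
    the query encoder slowly (m close to 1)."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

# One update step with a large momentum keeps the keys nearly unchanged.
updated = momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9)
```

The `m=0.999` default mirrors the value commonly used in the MoCo papers; in a real implementation the update is applied to every tensor of the key network after each training step.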

EfficientSeg: An Efficient Semantic Segmentation Network

EfficientSeg, a modified and scalable version of U-Net that can be trained efficiently despite its depth, is introduced; it outperformed the U-Net baseline score and took fourth place in the semantic segmentation track of the ECCV 2020 VIPriors challenge.

Revisiting Unreasonable Effectiveness of Data in Deep Learning Era

It is found that performance on vision tasks increases logarithmically with the volume of training data, and that representation learning (or pre-training) still holds a lot of promise.