VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
@article{Bruintjes2021VIPriors1V,
  title   = {VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges},
  author  = {Robert-Jan Bruintjes and Attila Lengyel and Marcos Baptista R{\'i}os and Osman Semih Kayhan and Jan C. van Gemert},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2201.08625}
}
We present the first edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" challenges. We offer four data-impaired challenges, where models are trained from scratch and the number of training samples is reduced to a fraction of the full set. Furthermore, to encourage data-efficient solutions, we prohibit the use of pre-trained models and other transfer learning techniques. The majority of top-ranking solutions make heavy use of data augmentation, model ensembling…
11 Citations
A Strong Baseline for the VIPriors Data-Efficient Image Classification Challenge
- Computer Science, Environmental Science · ArXiv
- 2021
This work presents a strong baseline for data-efficient image classification on the VIPriors challenge dataset, which is a sub-sampled version of ImageNet-1k with 100 images per class, and achieves 69.7% accuracy.
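The sub-sampling setup is straightforward to reproduce. Below is a minimal sketch, assuming torchvision and a local ImageNet-style directory; the `root` path, seed, and selection are assumptions, not the official challenge split.

```python
import random
from collections import defaultdict
from torchvision.datasets import ImageFolder

def subsample_per_class(root, per_class=100, seed=0):
    """Keep `per_class` images per class from an ImageNet-style folder.

    Hypothetical helper; the official VIPriors subset is a fixed split.
    """
    ds = ImageFolder(root)
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in ds.samples:
        by_class[label].append(path)
    subset = []
    for label, paths in sorted(by_class.items()):
        rng.shuffle(paths)
        subset += [(p, label) for p in paths[:per_class]]
    return subset  # list of (filepath, class_index) pairs
```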
2nd Place Solution for ICCV 2021 VIPriors Image Classification Challenge: An Attract-and-Repulse Learning Approach
- Computer Science · ArXiv
- 2022
A new framework, termed Attract-and-Repulse, is proposed; it combines Contrastive Regularization to enrich feature representations, Symmetric Cross Entropy to balance fitting across classes, and Mean Teacher to calibrate label information, achieving second place in the ICCV 2021 VIPriors Image Classification Challenge.
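For context, a minimal sketch of Symmetric Cross Entropy (Wang et al., 2019), one component named above; the `alpha`/`beta` weights and the clamp constant are assumptions, not the authors' settings.

```python
import torch.nn.functional as F

def symmetric_cross_entropy(logits, targets, alpha=0.1, beta=1.0):
    # Forward CE: -sum(y * log p), the usual classification loss.
    ce = F.cross_entropy(logits, targets)
    # Reverse CE: -sum(p * log y); log(0) on non-target classes is made
    # finite by clamping the one-hot target (clamp value is an assumption).
    pred = F.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, logits.size(1)).float().clamp(min=1e-4)
    rce = -(pred * one_hot.log()).sum(dim=1).mean()
    return alpha * ce + beta * rce
```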
Tune It or Don’t Use It: Benchmarking Data-Efficient Image Classification
- Computer Science, Environmental Science · 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
- 2021
Surprisingly, it is found that tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline, which outperforms all but one specialized method and performs competitively with the remaining one.
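A minimal sketch of that kind of search, assuming a hypothetical `train_and_evaluate(lr, wd, bs)` that returns validation accuracy; the grid values are illustrative.

```python
from itertools import product

def tune_baseline(train_and_evaluate):
    # Grid over the three hyper-parameters the benchmark highlights.
    grid = product([0.1, 0.03, 0.01],    # learning rate
                   [1e-4, 5e-4, 1e-3],   # weight decay
                   [32, 64, 128])        # batch size
    # Select the configuration with the best validation accuracy.
    return max(grid, key=lambda cfg: train_and_evaluate(*cfg))
```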
Equivariance and Invariance Inductive Bias for Learning from Insufficient Data
- Computer Science · ECCV
- 2022
State-of-the-art experimental results on real-world benchmarks (VIPriors, ImageNet100 and NICO) validate the great potential of equivariance and invariance in data-efficient learning.
TCLR: Temporal Contrastive Learning for Video Representation
- Computer Science · Comput. Vis. Image Underst.
- 2022
A new temporal contrastive learning framework consisting of two novel losses improves upon existing contrastive self-supervised video representation learning methods, with significant improvements over state-of-the-art results in downstream video understanding tasks such as action recognition, limited-label action classification, and nearest-neighbor video retrieval on multiple video datasets and backbones.
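As background, a minimal sketch of the clip-level InfoNCE objective this line of work builds on; TCLR's two novel losses additionally contrast segments within a clip's own timeline, which is not reproduced here.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # z1, z2: (B, D) embeddings of two clips drawn from the same videos.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (B, B) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)      # diagonal pairs are positives
```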
DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations
- Computer Science · MMSports@MM
- 2022
DeepSportradar-v1, a suite of computer vision tasks, datasets and benchmarks for automated sport understanding, is introduced to close the gap between academic research and real-world settings.
Genetic Programming-Based Evolutionary Deep Learning for Data-Efficient Image Classification
- Computer Science · IEEE Transactions on Evolutionary Computation
- 2022
A genetic programming-based evolutionary deep learning approach is proposed that automatically evolves variable-length models using important operators from both the image and classification domains, achieving better performance than competing deep learning methods in most cases.
Image Classification with Small Datasets: Overview and Benchmark
- Computer Science · IEEE Access
- 2022
This article systematically organizes and connects past studies to consolidate a community that is currently fragmented and scattered, and proposes a common benchmark that allows for an objective comparison of approaches.
WikiChurches: A Fine-Grained Dataset of Architectural Styles with Real-World Challenges
- Computer Science · NeurIPS Datasets and Benchmarks
- 2021
A novel dataset for architectural style classification consisting of 9,485 images of church buildings is presented, along with 631 bounding box annotations of characteristic visual features for 139 churches from four major categories, serving as a benchmark for various research fields.
References
Showing 1-10 of 118 references
Distilling Visual Priors from Self-Supervised Learning
- Computer Science · ECCV Workshops
- 2020
A novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting is presented.
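A minimal sketch of the distillation term such a two-phase pipeline typically uses in its second phase, assuming teacher and student logits for the same batch; the temperature `T` is an assumption.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between temperature-softened distributions; T*T rescales
    # gradients so their magnitude is comparable across temperatures.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```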
1st Visual Inductive Priors for Data-Efficient Deep Learning Workshop at ECCV 2020: Semantic Segmentation Challenge Track Technical Report: Multi-level Tail Pixel CutMix and Scale Attention for Long-tailed Scene Parsing
- Environmental Science, Computer Science
- 2020
A new multi-level tail pixel CutMix is proposed, which cuts regions according to the tail pixel distribution to improve long-tailed scene parsing, and a scale attention model with multi-scale augmentation is applied to obtain a more effective representation of the long-tail pixel distribution.
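For reference, a minimal sketch of the plain CutMix operation the report builds on; its tail-pixel variant selects regions by class-frequency statistics rather than uniformly, which is not reproduced here.

```python
import numpy as np
import torch

def cutmix(images, labels, alpha=1.0):
    # Sample a mixing ratio and a random partner for every image in the batch.
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    H, W = images.shape[-2:]
    # Cut a box whose area is roughly (1 - lam) of the image.
    rh, rw = int(H * np.sqrt(1 - lam)), int(W * np.sqrt(1 - lam))
    cy, cx = np.random.randint(H), np.random.randint(W)
    y1, y2 = max(cy - rh // 2, 0), min(cy + rh // 2, H)
    x1, x2 = max(cx - rw // 2, 0), min(cx + rw // 2, W)
    images[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (H * W)   # correct lam for clipping
    return images, labels, labels[perm], lam    # mix the two losses by lam
```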
A Technical Report for VIPriors Image Classification Challenge
- Computer Science · ArXiv
- 2020
In this challenge, the difficulty lies in training the model from scratch without any pretrained weights, so several strong backbones and multiple loss functions are used to learn more representative features.
Data-Efficient Deep Learning Method for Image Classification Using Data Augmentation, Focal Cosine Loss, and Ensemble
- Computer Science · ArXiv
- 2020
This work applies techniques in three aspects, data, loss function, and prediction, to enable training from scratch with less data and obtain high accuracy, using ImageNet data that consists of only 50 images per class.
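A hedged sketch of one plausible focal-plus-cosine combination; `gamma` and the mixing weight are assumptions, and the authors' exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def focal_cosine_loss(logits, targets, gamma=2.0, xent_weight=0.1):
    one_hot = F.one_hot(targets, logits.size(1)).float()
    # Cosine term: pull the predicted distribution toward the one-hot target.
    cosine = 1.0 - F.cosine_similarity(
        F.softmax(logits, dim=1), one_hot, dim=1).mean()
    # Focal term: down-weight examples the model already classifies easily.
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)
    focal = ((1.0 - pt) ** gamma * ce).mean()
    return cosine + xent_weight * focal
```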
Data-efficient semantic segmentation via extremely perturbed data augmentation
- Computer Science
- 2020
A new variation of the strong augmentation CutMix, Progressive Sprinkles, is presented, with improved results over the authors' baseline, and the tuning of hyper-parameters for these advanced augmentations in the area of scene understanding is investigated.
A Visual Inductive Priors Framework for Data-Efficient Image Classification
- Computer Science · ECCV Workshops
- 2020
This work proposes a novel neural network architecture, DSK-net, which is very effective when training on small datasets, and wins 1st place in the VIPriors image classification competition.
ResNeSt: Split-Attention Networks
- Computer Science · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2022
This work designs ResNeSt, a new variant of the ResNet model that applies channel-wise attention across different network branches to leverage the complementary strengths of feature-map attention and multi-path representation, outperforming EfficientNet in terms of the accuracy/latency trade-off.
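A simplified sketch of the split-attention idea, assuming the input arrives as pre-split branch features; the full ResNeSt block adds cardinal groups and convolutional stems not shown here.

```python
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Re-weight `radix` parallel branches with attention computed across them."""

    def __init__(self, channels, radix=2, reduction=4):
        super().__init__()
        inter = max(channels // reduction, 8)
        self.radix = radix
        self.fc1 = nn.Conv2d(channels, inter, 1)           # squeeze
        self.fc2 = nn.Conv2d(inter, channels * radix, 1)   # per-branch logits

    def forward(self, branches):
        # branches: list of `radix` tensors, each of shape (B, C, H, W).
        fused = sum(branches)
        gap = F.adaptive_avg_pool2d(fused, 1)              # (B, C, 1, 1)
        logits = self.fc2(F.relu(self.fc1(gap)))           # (B, C*radix, 1, 1)
        b, c = gap.shape[:2]
        attn = logits.view(b, self.radix, c, 1, 1).softmax(dim=1)
        return sum(a * x for a, x in zip(attn.unbind(dim=1), branches))
```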
Improved Baselines with Momentum Contrastive Learning
- Computer Science · ArXiv
- 2020
With simple modifications to MoCo, this note establishes stronger baselines that outperform SimCLR without requiring large training batches, in the hope of making state-of-the-art unsupervised learning research more accessible.
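A minimal sketch of the momentum-encoder update at the core of MoCo; `m = 0.999` follows the paper's typical setting, and both encoders are assumed to share an architecture.

```python
import torch

@torch.no_grad()
def momentum_update(query_encoder, key_encoder, m=0.999):
    # The key encoder trails the query encoder as an exponential moving
    # average; it receives no gradients of its own.
    for q_param, k_param in zip(query_encoder.parameters(),
                                key_encoder.parameters()):
        k_param.data.mul_(m).add_(q_param.data, alpha=1 - m)
```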
EfficientSeg: An Efficient Semantic Segmentation Network
- Computer Science · ArXiv
- 2020
The EfficientSeg architecture, a modified and scalable version of U-Net that can be trained efficiently despite its depth, is introduced; it outperforms the U-Net baseline score and took fourth place in the semantic segmentation track of the ECCV 2020 VIPriors challenge.
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
- Computer Science · 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
It is found that performance on vision tasks increases logarithmically with the volume of training data, and it is shown that representation learning (or pre-training) still holds a lot of promise.
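The reported trend amounts to accuracy growing roughly linearly in log10(N). A minimal sketch of such a fit, with illustrative numbers that are not the paper's measurements:

```python
import numpy as np

# Hypothetical (dataset size, accuracy) pairs chosen only to illustrate
# fitting a log-linear trend; not measured values from the paper.
sizes = np.array([1e6, 1e7, 1e8, 3e8])       # training images
scores = np.array([62.0, 68.5, 75.1, 79.2])  # accuracy (%)
slope, intercept = np.polyfit(np.log10(sizes), scores, deg=1)
print(f"fit: accuracy ~ {slope:.1f} * log10(N) + {intercept:.1f}")
```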