Distance Metric Learning Loss Functions in Few-Shot Scenarios of Supervised Language Models Fine-Tuning
@article{Sosnowski2022DistanceML,
  title={Distance Metric Learning Loss Functions in Few-Shot Scenarios of Supervised Language Models Fine-Tuning},
  author={Witold Sosnowski and Karolina Seweryn and Anna Wr{\'o}blewska and Piotr Gawrysiak},
  journal={ArXiv},
  year={2022},
  volume={abs/2211.15195}
}
References
Showing 10 of 27 references.
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
- Computer Science · COLING · 2020
LEOPARD, trained with a state-of-the-art transformer architecture, generalizes better to tasks entirely unseen during training, with as few as 4 examples per label, than self-supervised pre-training or multi-task training.
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
- Computer Science · ICLR · 2021
This work proposes a supervised contrastive learning (SCL) objective for the fine-tuning stage of natural language understanding classification models. The new objective yields models that are more robust to different levels of noise in the training data and generalize better to related tasks with limited labeled data; the combined objective is sketched below.
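A minimal sketch of the fine-tuning objective, with assumed notation (the symbols λ, L_CE, and L_SCL are not spelled out in this entry): cross-entropy is blended with a supervised contrastive term,

$$\mathcal{L} = (1 - \lambda)\,\mathcal{L}_{\mathrm{CE}} + \lambda\,\mathcal{L}_{\mathrm{SCL}},$$

where λ ∈ [0, 1] is a weighting hyperparameter and L_SCL is the supervised contrastive loss computed over the encoder's sentence embeddings (see the Supervised Contrastive Learning entry below for its form).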
Diversity With Cooperation: Ensemble Methods for Few-Shot Classification
- Computer Science · IEEE/CVF International Conference on Computer Vision (ICCV) · 2019
This work shows that by addressing the fundamental high-variance issue of few-shot learning classifiers, it is possible to significantly outperform current meta-learning techniques.
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
- Computer Science · NeurIPS · 2019
Proposes a theoretically principled label-distribution-aware margin (LDAM) loss, motivated by minimizing a margin-based generalization bound. It replaces the standard cross-entropy objective during training and can be combined with prior class-imbalance strategies such as re-weighting or re-sampling; the loss is sketched below.
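A sketch of the LDAM loss, with assumed notation (z are the logits for a sample of class y, n_y the number of training samples in class y, and C a tunable constant): cross-entropy with a per-class margin that grows as the class becomes rarer,

$$\mathcal{L}_{\mathrm{LDAM}}(x, y) = -\log \frac{e^{z_y - \Delta_y}}{e^{z_y - \Delta_y} + \sum_{j \neq y} e^{z_j}}, \qquad \Delta_y = \frac{C}{n_y^{1/4}}.$$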
SoftTriple Loss: Deep Metric Learning Without Triplet Sampling
- Computer Science · IEEE/CVF International Conference on Computer Vision (ICCV) · 2019
The SoftTriple loss extends the SoftMax loss with multiple centers for each class, building on the observation that the SoftMax loss is equivalent to a smoothed triplet loss in which each class has a single center; a sketch follows.
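A sketch of the SoftTriple loss, with assumed notation (x_i is a normalized embedding with label y_i, w_c^k the k-th center of class c, γ a soft-assignment temperature, λ a scaling factor, and δ a margin): the similarity of a sample to a class pools softly over that class's centers,

$$\mathcal{S}_{i,c} = \sum_{k} \frac{\exp\!\big(\tfrac{1}{\gamma}\, x_i^{\top} w_c^{k}\big)}{\sum_{k'} \exp\!\big(\tfrac{1}{\gamma}\, x_i^{\top} w_c^{k'}\big)}\; x_i^{\top} w_c^{k},$$

and the loss is a margin-based SoftMax over these similarities,

$$\mathcal{L}_{\mathrm{SoftTriple}}(x_i) = -\log \frac{\exp\!\big(\lambda (\mathcal{S}_{i, y_i} - \delta)\big)}{\exp\!\big(\lambda (\mathcal{S}_{i, y_i} - \delta)\big) + \sum_{c \neq y_i} \exp\!\big(\lambda\, \mathcal{S}_{i, c}\big)}.$$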
Supervised Contrastive Learning
- Computer Science · NeurIPS · 2020
Proposes a novel training methodology that consistently outperforms cross-entropy on supervised learning tasks across different architectures and data augmentations, by modifying the batch contrastive loss that has recently proved very effective at learning powerful representations in the self-supervised setting; the supervised form is sketched below.
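A sketch of the supervised contrastive (SupCon) loss, with assumed notation (z_i are L2-normalized embeddings in a batch indexed by I, A(i) the batch without i, P(i) the positives sharing i's label, and τ a temperature):

$$\mathcal{L}_{\mathrm{SupCon}} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i \cdot z_a / \tau)}.$$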
Regularizing Neural Networks by Penalizing Confident Output Distributions
- Computer Science · ICLR · 2017
It is found that both label smoothing and the confidence penalty improve state-of-the-art models across benchmarks without modifying existing hyperparameters, suggesting the wide applicability of these regularizers.
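A sketch of the confidence penalty, with assumed notation (p_θ is the model's output distribution, H its entropy, and β the penalty strength): the negative log-likelihood is regularized by rewarding high-entropy predictions,

$$\mathcal{L}(\theta) = -\sum_{i} \log p_\theta(y_i \mid x_i) \;-\; \beta \sum_{i} H\big(p_\theta(\cdot \mid x_i)\big).$$

Label smoothing, by contrast, mixes the target distribution with a uniform prior rather than penalizing the output entropy directly.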
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
- Computer Science · EMNLP · 2013
Introduces a Sentiment Treebank with fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences, posing new challenges for sentiment compositionality, and the Recursive Neural Tensor Network to address them.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Computer Science · NAACL · 2019
Introduces BERT, a language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
SentEval: An Evaluation Toolkit for Universal Sentence Representations
- Computer Science · LREC · 2018
We introduce SentEval, a toolkit for evaluating the quality of universal sentence representations. SentEval encompasses a variety of tasks, including binary and multi-class classification, natural…