@inproceedings{Hambardzumyan2021WARPWA,
  title={WARP: Word-level Adversarial ReProgramming},
  author={Karen Hambardzumyan and Hrant Khachatrian and Jonathan May},
  booktitle={ACL},
  year={2021}
}
• Published in ACL 1 January 2021
• Computer Science
Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks. A common approach to transfer learning for multiple tasks that maximizes parameter sharing trains one or more task-specific layers on top of the language model. In this paper, we present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation. Adversarial reprogramming attempts to learn task-specific word embeddings that…
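To make the idea concrete, the sketch below shows the general setup of learning a handful of task-specific prompt embeddings while the underlying model stays frozen. It is a minimal illustration only: the class name `PromptTunedClassifier`, the toy `nn.TransformerEncoder` stand-in for the pretrained language model, and all hyperparameters are assumptions for the example, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PromptTunedClassifier(nn.Module):
    """Toy prompt-tuning setup: only the prompt embeddings and a small
    classification head are trained; the backbone stays frozen.
    (Illustrative sketch, not the authors' implementation.)"""

    def __init__(self, vocab_size=1000, d_model=64, n_prompt=4, n_classes=2):
        super().__init__()
        # Stand-in for a pretrained language model (frozen below).
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in list(self.tok_emb.parameters()) + list(self.backbone.parameters()):
            p.requires_grad = False  # freeze the "pretrained" model

        # Trainable continuous prompt vectors, prepended to every input.
        self.prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)
        # Trainable head mapping the first prompt position to class scores.
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, input_ids):
        b = input_ids.size(0)
        x = self.tok_emb(input_ids)                     # (b, seq, d)
        p = self.prompt.unsqueeze(0).expand(b, -1, -1)  # (b, n_prompt, d)
        h = self.backbone(torch.cat([p, x], dim=1))     # prompt + tokens
        return self.head(h[:, 0])                       # classify from prompt slot

model = PromptTunedClassifier()
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One illustrative training step on random data.
ids = torch.randint(0, 1000, (8, 16))
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = nn.functional.cross_entropy(model(ids), labels)
loss.backward()
optimizer.step()
```

Only the prompt vectors and the small head receive gradient updates, which is what makes this family of methods parameter-efficient compared with full fine-tuning.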
58 Citations

Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning
• Computer Science
ArXiv
• 2022
CP-Tuning is presented, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning PLMs without any manual engineering of task-specific prompts and verbalizers; it is integrated with a task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Contrastive Demonstration Tuning for Pre-trained Language Models
• Computer Science
ArXiv
• 2022
Experimental results illustrate that the proposed pluggable, extensible, and efficient approach, named contrastive demonstration tuning, is free of demonstration sampling and, when integrated with the previous approaches LM-BFF and P-tuning, yields better performance.
Prototypical Verbalizer for Prompt-based Few-shot Tuning
• Computer Science
ACL
• 2022
This work proposes the prototypical verbalizer (ProtoVerb) which is built directly from training data and demonstrates that ProtoVerb significantly outperforms current automatic verbalizers, especially when training data is extremely scarce.
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
• Computer Science, Linguistics
ArXiv
• 2022
A novel model called UniPrompt uses a unified prompt for all languages; it is model-based and language-agnostic, and can significantly outperform strong baselines across different languages.
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
• Computer Science
ArXiv
• 2021
A survey of recent work that uses large, pre-trained transformer-based language models to solve NLP tasks via pre-training then fine-tuning, prompting, or text generation approaches.
The Power of Scale for Parameter-Efficient Prompt Tuning
• Computer Science
EMNLP
• 2021
This work explores “prompt tuning”, a simple yet effective mechanism for learning “soft prompts” to condition frozen language models to perform specific downstream tasks, and shows that conditioning a frozen model with soft prompts confers benefits in robustness to domain transfer, as compared to full model tuning.
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
• Computer Science
NeurIPS
• 2021
An analysis framework is proposed that links the pretraining and downstream tasks through an underlying latent-variable generative model of text: the downstream classifier must recover a function of the posterior distribution over the latent variables, which yields downstream guarantees under weaker non-degeneracy conditions.
Y-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning
• Computer Science
• 2022
Y-Tuning is proposed, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks; it achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and a much lower training cost.
• Computer Science
FINDINGS
• 2022
This work proposes an answer space clustered prompting model (ASCM) together with a synonym initialization method (SI), which automatically categorizes all answer tokens in a semantically clustered embedding space, and a stable semi-supervised method named stair learning (SL) that distills knowledge from stronger models to weaker models in an orderly manner.
Black-Box Tuning for Language-Model-as-a-Service
• Computer Science
ArXiv
• 2022
The experimental results show that black-box tuning with RoBERTa on a few labeled samples not only outperforms manual prompts and GPT-3’s in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.

References

Showing 1-10 of 40 references
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 2020
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
• Computer Science
EMNLP
• 2013
A Sentiment Treebank is introduced that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality; the Recursive Neural Tensor Network is also introduced.
It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
• Computer Science
NAACL
• 2021
This work shows that performance similar to GPT-3 can be obtained with language models that are much “greener” in that their parameter count is several orders of magnitude smaller, and identifies key factors required for successful natural language understanding with small language models.
Adversarial Reprogramming of Neural Networks
• Computer Science
ICLR
• 2019
This paper demonstrates adversarial reprogramming on six ImageNet classification models, repurposing these models to perform a counting task, as well as classification tasks: classification of MNIST and CIFAR-10 examples presented as inputs to the ImageNet model.
RoBERTa: A Robustly Optimized BERT Pretraining Approach
• Computer Science
ArXiv
• 2019
It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
The Sixth PASCAL Recognizing Textual Entailment Challenge
• Computer Science
TAC
• 2009
This paper presents the Sixth Recognizing Textual Entailment (RTE-6) challenge, in which the traditional Main Task was replaced by a new task, similar to the RTE-5 Search Pilot, where textual entailment is performed on a real corpus in the Update Summarization scenario.