• Corpus ID: 245144648

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

Authors: Shrimai Prabhumoye, Rafal Kocielnik, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro
Warning: this paper contains content that may be offensive or upsetting. Detecting social bias in text is challenging due to nuance, subjectivity, and difficulty in obtaining good quality labeled datasets at scale, especially given the evolving nature of social biases and society. To address these challenges, we propose a few-shot instruction-based method for prompting pre-trained language models (LMs). We select a few class-balanced exemplars from a small support repository that are closest to…
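The exemplar-selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract is truncated at this point, so the details here are assumptions rather than the authors' method: exemplars are taken to be ranked by similarity to the query text, the similarity is a plain bag-of-words cosine (a stand-in for whatever encoder the paper uses), and the function names, labels, prompt layout, and sample sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_exemplars(query, support, k_per_class):
    """Pick the k_per_class most similar support items for each label,
    so that the chosen exemplars are class-balanced."""
    q = bow(query)
    by_label = defaultdict(list)
    for text, label in support:
        by_label[label].append((cosine(q, bow(text)), text))
    chosen = []
    for label in sorted(by_label):
        ranked = sorted(by_label[label], key=lambda p: p[0], reverse=True)
        chosen.extend((text, label) for _, text in ranked[:k_per_class])
    return chosen


def build_prompt(instruction, exemplars, query):
    """Assemble an instruction, the exemplars, and the unanswered query."""
    lines = [instruction, ""]
    for text, label in exemplars:
        lines += [f"Post: {text}", f"Answer: {label}", ""]
    lines += [f"Post: {query}", "Answer:"]
    return "\n".join(lines)


# Hypothetical support repository with mild placeholder sentences.
support = [
    ("people from that town are all lazy", "yes"),
    ("old people cannot learn new things", "yes"),
    ("the town library opens at nine", "no"),
    ("I learned a new recipe today", "no"),
]
query = "people from that town never learn"

exemplars = select_exemplars(query, support, k_per_class=1)
prompt = build_prompt("Does the post contain social bias? Answer yes or no.",
                      exemplars, query)
print(prompt)
```

Selecting per label before ranking is what keeps the prompt class-balanced; a global top-k over the whole repository could return exemplars from a single class.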
Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models
Coupling off-the-shelf “small” language models with the prompt-and-reranking method enables us to perform arbitrary textual style transfer without any model training or prompt-tuning.
Toxicity Detection with Generative Prompt-based Inference
This work explores the generative variant of zero-shot prompt-based toxicity detection with comprehensive trials on prompt engineering and highlights the strengths of its generative classification approach both quantitatively and qualitatively.
COLD: A Benchmark for Chinese Offensive Language Detection
The proposed COLDETECTOR is used to help detoxify Chinese online communities and to evaluate the safety of generative language models. The study finds that CPM tends to generate more offensive output than CDialGPT and that specific prompts can trigger offensive outputs more easily.
Predictability and Surprise in Large Generative Models
This paper highlights a counterintuitive property of large-scale generative models: a paradoxical combination of predictable loss on a broad training distribution and unpredictable specific capabilities, inputs, and outputs. It analyzes how these conflicting properties combine to give model developers various motivations for deploying these models, as well as challenges that can hinder deployment.


Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
This paper demonstrates a surprising finding: pretrained language models recognize, to a considerable degree, their undesirable biases and the toxicity of the content they produce. It also proposes a decoding algorithm, known as self-debiasing, that reduces the probability of a language model producing problematic text.
Making Pre-trained Language Models Better Few-shot Learners
The LM-BFF approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning.
SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification
This work creates the largest available dataset for this task, SOLID, which contains over nine million English tweets labeled in a semi-supervised manner, and demonstrates experimentally that using SOLID along with OLID yields improved performance on the OLID test set for two different models, especially for the lower levels of the taxonomy.
Language Models are Few-Shot Learners
GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
This work shows that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering, and recommends finetuned LMs for few-shot learning as it is more accurate, robust to different prompts, and can be made nearly as efficient as using frozen LMs.
Deeper Attention to Abusive User Content Moderation
A novel, deep, classification-specific attention mechanism improves the performance of the RNN further, and can also highlight suspicious words for free, without including highlighted words in the training data.
Factual Probing Is [MASK]: Learning vs. Learning to Recall
OptiPrompt is proposed, a novel and efficient method which directly optimizes in continuous embedding space and is able to predict an additional 6.4% of facts in the LAMA benchmark.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Abusive Language Detection in Online User Content
A machine-learning-based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach, together with a corpus of user comments annotated for abusive language, the first of its kind.
Adversarial NLI: A New Benchmark for Natural Language Understanding
This work introduces a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure, and shows that non-expert annotators are successful at finding weaknesses in current state-of-the-art models.