PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

@inproceedings{Ma2020PowerTransformerUC,
  title={PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction},
  author={Xinyao Ma and Maarten Sap and Hannah Rashkin and Yejin Choi},
  booktitle={EMNLP},
  year={2020}
}
Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction. For example, a female character in a story is often portrayed as passive and powerless (“_She daydreams about being a doctor_”) while a man is portrayed as more proactive and powerful (“_He pursues his dream of being a doctor_”). We formulate **Controllable Debiasing**, a new revision task that aims to rewrite a given text to correct the implicit and… 
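To ground the task, here is a minimal sketch of the kind of verb-level agency check that could flag sentences for revision. PowerTransformer builds on connotation frames of power and agency, but the tiny `AGENCY_LEXICON` and the example sentences below are hypothetical stand-ins for illustration only, not the paper's actual resources.

```python
# Minimal sketch: flag sentences whose verbs carry low "agency"
# according to a toy lexicon. The lexicon below is a hypothetical
# stand-in for real connotation-frame resources.

AGENCY_LEXICON = {
    "daydreams": "low",   # passive framing
    "wishes": "low",
    "pursues": "high",    # proactive framing
    "demands": "high",
}

def flag_low_agency(sentence: str) -> bool:
    """Return True if any token is marked low-agency in the toy lexicon."""
    tokens = sentence.lower().strip(".").split()
    return any(AGENCY_LEXICON.get(tok) == "low" for tok in tokens)

for s in ["She daydreams about being a doctor.",
          "He pursues his dream of being a doctor."]:
    print(s, "->", "revise" if flag_low_agency(s) else "keep")
```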

Citations

Analyzing the Limits of Self-Supervision in Handling Bias in Language
TLDR
This work defines and comprehensively evaluates how well language models capture the semantics of four bias-related tasks (diagnosis, identification, extraction, and rephrasing) and indicates that language models can perform these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
Detect and Perturb: Neutral Rewriting of Biased and Sensitive Text via Gradient-based Decoding
TLDR
This work proposes a gradient-based rewriting framework, Detect and Perturb to Neutralize (DEPEN), that first detects sensitive components and masks them for regeneration, then perturbs the generation model at decoding time under a neutralizing constraint that pushes the distribution of sensitive attributes towards a uniform distribution.
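As a rough illustration of the detect-and-mask half of such a pipeline, the sketch below uses a small lexicon in place of DEPEN's learned detector; `regenerate` is a hypothetical placeholder for the masked-LM infilling and gradient-perturbed decoding step, which is not reproduced here.

```python
# Sketch of the "detect" half of a detect-and-perturb pipeline.
# A lexicon stands in for DEPEN's learned detector of sensitive spans;
# regenerate() is a hypothetical hook where a masked LM would refill
# the masked positions under a neutralizing constraint.

SENSITIVE = {"he", "she", "his", "her", "him"}

def mask_sensitive(text: str, mask_token: str = "[MASK]") -> str:
    """Replace sensitive tokens with a mask token for regeneration."""
    out = []
    for tok in text.split():
        core = tok.strip(".,!?").lower()
        out.append(mask_token if core in SENSITIVE else tok)
    return " ".join(out)

def regenerate(masked_text: str) -> str:
    """Hypothetical: a masked LM would fill [MASK] slots here, with
    decoding perturbed so sensitive attributes stay near-uniform."""
    raise NotImplementedError

print(mask_sensitive("She finished her degree, and he praised her."))
# -> "[MASK] finished [MASK] degree, and [MASK] praised [MASK]"
```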
Machine-in-the-Loop Rewriting for Creative Image Captioning
TLDR
A rewriting model is trained that, when prompted, modifies specified spans of text within the user's original draft to introduce descriptive and figurative elements locally, allowing the user to retain control over the content.
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness
TLDR
A VAE-based framework that obfuscates stylistic features of human-generated text through style transfer by automatically re-writing the text itself is proposed, and its effectiveness in improving the fairness of downstream classifiers is demonstrated.
A Survey on Gender Bias in Natural Language Processing
TLDR
A survey of 304 papers on gender bias in natural language processing finds that research on gender bias suffers from four core limitations and sees overcoming these limitations as a necessary development in future research.
Controlled Text Generation as Continuous Optimization with Multiple Constraints
TLDR
This work formulates the decoding process as an optimization problem, which allows multiple attributes to be easily incorporated as differentiable constraints, and makes use of Lagrangian multipliers and gradient-descent-based techniques to generate the desired text.
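The optimization recipe is easiest to see on a toy problem. The sketch below applies the same Lagrangian descent/ascent scheme to a small convex objective rather than to text decoding, so the functions here are illustrative stand-ins.

```python
# Toy illustration of constrained optimization with a Lagrangian:
# minimize f(x) subject to g(x) <= 0, via gradient descent on x and
# gradient ascent on the multiplier lam. The paper applies a similar
# recipe in a continuous relaxation of the output text.
import numpy as np

def f(x):          # stand-in objective, e.g. a fluency score to minimize
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

def g(x):          # stand-in constraint g(x) <= 0, e.g. attribute threshold
    return x[0] + x[1] - 1.0

def grad_f(x):
    return np.array([2 * (x[0] - 3.0), 2 * (x[1] + 1.0)])

def grad_g(x):
    return np.array([1.0, 1.0])

x, lam, lr = np.zeros(2), 0.0, 0.05
for _ in range(2000):
    x -= lr * (grad_f(x) + lam * grad_g(x))   # descend on x
    lam = max(0.0, lam + lr * g(x))           # ascend on lam, keep >= 0
print(x, g(x))  # converges near (2.5, -1.5), the constrained optimum
```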
Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research
TLDR
A review of existing IS literature reveals that suboptimal text mining techniques are prevalent and that more advanced TLMs could be applied to enhance and expand IS research involving text data and to enable new IS research topics, thus creating more value for the research community.
Uncovering Implicit Gender Bias in Narratives through Commonsense Inference
TLDR
This work infers and analyzes the protagonist's motivations, attributes, mental states, and implications on others, using a commonsense reasoning engine to uncover implicit biases associated with the protagonist in model-generated stories.
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
TLDR
It is found that pretrained LMs can degenerate into toxic text even from seemingly innocuous prompts; an empirical assessment of several controllable generation methods finds that while data- or compute-intensive methods are more effective at steering away from toxicity than simpler solutions, no current method is failsafe against neural toxic degeneration.
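A sketch of the evaluation loop this implies is shown below, with `generate` and `toxicity_score` as hypothetical placeholders for the LM under test and a toxicity classifier (the paper scores generations with the Perspective API); the worst-of-k metric mirrors the paper's expected-maximum-toxicity measure.

```python
# Sketch of the evaluation loop: sample k continuations per prompt and
# record the worst-case toxicity. generate() and toxicity_score() are
# hypothetical placeholders for a language model and a toxicity
# classifier respectively.

def generate(prompt: str) -> str:
    """Hypothetical: sample one continuation from the LM under test."""
    raise NotImplementedError

def toxicity_score(text: str) -> float:
    """Hypothetical: return a toxicity probability in [0, 1]."""
    raise NotImplementedError

def expected_max_toxicity(prompts, k=25):
    """Average, over prompts, of the most toxic of k continuations."""
    per_prompt_max = [
        max(toxicity_score(generate(p)) for _ in range(k))
        for p in prompts
    ]
    return sum(per_prompt_max) / len(per_prompt_max)
```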
Modeling Social Media Narratives about Caste-related News Stories
TLDR
The overall goal of the work is to model social media narratives associated with caste-specific news stories and to tackle casteist social media posts using counter-narratives generated by leveraging the inferred value judgments.

References

SHOWING 1-10 OF 58 REFERENCES
Automatically Neutralizing Subjective Bias in Text
TLDR
Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.
Multiple-Attribute Text Rewriting
TLDR
This paper proposes a new model that controls several factors of variation in textual data, replacing the usual disentanglement condition with a simpler mechanism based on back-translation, and demonstrates that the fully entangled model produces better generations.
“Transforming” Delete, Retrieve, Generate Approach for Controlled Text Style Transfer
TLDR
This work introduces the Generative Style Transformer (GST), a new approach to rewriting sentences to a target style in the absence of parallel style corpora, which outperforms state-of-the-art systems across 5 datasets on sentiment, gender, and political slant transfer.
The Woman Worked as a Babysitter: On Biases in Language Generation
TLDR
The notion of the regard towards a demographic is introduced, the varying levels of regard towards different demographics are used as a defining metric for bias in NLG, and the extent to which sentiment scores are a relevant proxy metric for regard is analyzed.
The Curious Case of Neural Text Degeneration
TLDR
By sampling text from the dynamic nucleus of the probability distribution, which allows for diversity while effectively truncating the less reliable tail of the distribution, the resulting text better matches the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence.
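A compact NumPy sketch of the sampling rule follows; the threshold p = 0.9 and the toy logits are arbitrary illustrative values.

```python
# Nucleus (top-p) sampling: sample only from the smallest set of tokens
# whose cumulative probability exceeds p, truncating the unreliable tail.
import numpy as np

def nucleus_sample(logits, p=0.9, rng=None):
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # tokens, most likely first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # smallest nucleus covering p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
print(nucleus_sample(logits, p=0.9))  # index of a sampled in-nucleus token
```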
Unsupervised Text Style Transfer using Language Models as Discriminators
TLDR
This paper proposes a new technique that uses a target domain language model as the discriminator, providing richer and more stable token-level feedback during the learning process, and shows that this approach leads to improved performance on three tasks: word substitution decipherment, sentiment modification, and related language translation.
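The scoring principle (rank candidates by likelihood under a target-domain LM rather than a binary classifier) can be sketched with a toy unigram model; the mini-corpus and candidate sentences below are illustrative, and the paper itself uses a trained neural LM with richer token-level feedback.

```python
# Toy version of "LM as discriminator": build a unigram LM over a tiny
# target-style corpus and rank candidate rewrites by log-likelihood.
# A unigram model only illustrates the scoring principle.
import math
from collections import Counter

target_corpus = "the food was wonderful and the staff was lovely".split()
counts = Counter(target_corpus)
total = sum(counts.values())

def unigram_logprob(sentence, alpha=1.0, vocab_size=10_000):
    """Add-alpha smoothed unigram log-likelihood under the target LM."""
    return sum(
        math.log((counts[w] + alpha) / (total + alpha * vocab_size))
        for w in sentence.split()
    )

candidates = ["the food was terrible", "the food was wonderful"]
print(max(candidates, key=unigram_logprob))  # prefers the on-style rewrite
```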
Joint Copying and Restricted Generation for Paraphrase
TLDR
A novel Seq2Seq model that fuses a copying decoder and a restricted generative decoder is proposed, outperforming the state-of-the-art approaches in terms of both informativeness and language quality.
Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer
TLDR
This paper proposes simpler methods motivated by the observation that text attributes are often marked by distinctive phrases; the strongest method extracts content words by deleting phrases associated with the sentence's original attribute value, retrieves new phrases associated with the target attribute, and uses a neural model to fluently combine these into a final output.
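A toy sketch of the delete and retrieve steps follows. Attribute markers are taken to be words that skew heavily toward one attribute's corpus, and the paper's learned neural combination step is replaced here by simple slot filling; the corpora, threshold, and retrieval rule are all illustrative.

```python
# Sketch of delete-retrieve: find attribute-marker words by how strongly
# they skew toward one corpus, delete them from the input, and fill the
# slots with markers of the target attribute (toy retrieval).
from collections import Counter

negative = ["the food was terrible", "terrible slow service"]
positive = ["the food was amazing", "amazing friendly service"]

def salient_words(own, other, threshold=1.5, alpha=1.0):
    own_c = Counter(" ".join(own).split())
    other_c = Counter(" ".join(other).split())
    return {w for w in own_c
            if (own_c[w] + alpha) / (other_c[w] + alpha) > threshold}

neg_markers = salient_words(negative, positive)   # {"terrible", "slow"}
pos_markers = salient_words(positive, negative)   # {"amazing", "friendly"}

sentence = "the food was terrible"
replacement = sorted(pos_markers)[0]              # toy retrieval: "amazing"
rewritten = [replacement if w in neg_markers else w
             for w in sentence.split()]           # delete + slot fill
print(" ".join(rewritten))                        # -> "the food was amazing"
```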
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
TLDR
The Plug and Play Language Model (PPLM) for controllable language generation is proposed, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM.
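A drastically simplified numerical miniature of the mechanism: nudge a hidden state along the gradient of a toy attribute classifier's log-probability, then re-read next-token logits from the perturbed state. Real PPLM perturbs a transformer's key-value activations; all weights below are random toys.

```python
# Miniature of PPLM-style steering: move the LM's hidden state h along
# the gradient of log p(attribute | h) from a linear-sigmoid attribute
# classifier, then recompute next-token logits from the perturbed state.
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 16, 50
W_lm = rng.normal(size=(vocab, d))     # toy LM output head
w_attr = rng.normal(size=d)            # toy attribute classifier weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h = rng.normal(size=d)                 # current hidden state
for _ in range(10):                    # a few steps toward the attribute
    # d/dh log sigmoid(w.h) = (1 - sigmoid(w.h)) * w
    h += 0.1 * (1.0 - sigmoid(w_attr @ h)) * w_attr

logits = W_lm @ h                      # next-token logits from perturbed h
print(int(np.argmax(logits)), float(sigmoid(w_attr @ h)))  # attr prob rose
```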
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
TLDR
A new framework for evaluating story understanding and script learning: the "Story Cloze Test", which requires a system to choose the correct ending to a four-sentence story, and a new corpus of 50k five-sentence commonsense stories, ROCStories, to enable this evaluation.
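One common way to attempt the test with a language model is to pick the ending the model finds more probable given the four-sentence context; `lm_logprob` below is a hypothetical placeholder for any such scorer.

```python
# Sketch of scoring a Story Cloze instance: choose the candidate ending
# the LM assigns higher log-probability after the context. lm_logprob()
# is a hypothetical placeholder for a real language-model scorer.

def lm_logprob(text: str) -> float:
    """Hypothetical: log p(text) under some language model."""
    raise NotImplementedError

def choose_ending(context: str, endings) -> str:
    # Length normalization across endings is omitted for brevity.
    return max(endings, key=lambda e: lm_logprob(context + " " + e))
```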