“You Are Grounded!”: Latent Name Artifacts in Pre-trained Language Models

@inproceedings{Shwartz2020YouAG,
  title={“You Are Grounded!”: Latent Name Artifacts in Pre-trained Language Models},
  author={Vered Shwartz and Rachel Rudinger and Oyvind Tafjord},
  booktitle={EMNLP},
  year={2020}
}
Pre-trained language models (LMs) may perpetuate biases from their training corpus to downstream models. We focus on artifacts associated with the representation of given names (e.g., Donald), which, depending on the corpus, may be associated with specific entities, as indicated by next token prediction (e.g., Trump). While helpful in some contexts, such grounding also occurs in under-specified or inappropriate contexts. For example, endings generated for "Donald is a" substantially…
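As a rough illustration of the kind of probe the abstract describes (a minimal sketch, not the authors' evaluation protocol), one can inspect a pre-trained LM's next-token distribution after a given name in an under-specified context. The model choice (GPT-2 via Hugging Face transformers) and the specific prompts below are assumptions for demonstration only.

# Minimal sketch: probe next-token predictions after a given name to see
# whether the name is implicitly grounded to a specific entity.
# Assumes GPT-2 via Hugging Face transformers; not the paper's exact setup.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt, k=5):
    """Return the k most probable next tokens and their probabilities."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)
    probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(probs, k)
    return [(tokenizer.decode(int(idx)).strip(), float(p))
            for idx, p in zip(top.indices, top.values)]

# Under-specified contexts: compare what different given names evoke.
for prompt in ["Donald", "Donald is a", "John is a"]:
    print(prompt, "->", top_next_tokens(prompt))

A probe along these lines tends to surface the effect noted in the abstract, e.g. "Trump" receiving high probability after "Donald", even though the context gives no evidence that this particular entity is intended.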
