Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection
- Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg
- Annual Meeting of the Association for Computational Linguistics (ACL)
- 16 April 2020
This work presents Iterative Null-space Projection (INLP), a novel method for removing information from neural representations, based on repeatedly training linear classifiers that predict the property to be removed and then projecting the representations onto the classifiers' nullspace.
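A minimal sketch of the INLP loop, assuming NumPy and scikit-learn; the names and hyperparameters are illustrative, not the authors' reference implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """Orthogonal projection onto the nullspace of W: P = I - W^+ W."""
    return np.eye(W.shape[1]) - np.linalg.pinv(W) @ W

def inlp(X, z, n_iters=10):
    """Iteratively remove linearly decodable information about z from X."""
    P = np.eye(X.shape[1])
    X_proj = X.copy()
    for _ in range(n_iters):
        # Train a linear classifier to predict the protected attribute.
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        # Project onto the nullspace of its weights, removing the
        # direction(s) the classifier relied on in this round.
        P_i = nullspace_projection(clf.coef_)
        X_proj = X_proj @ P_i  # P_i is symmetric, so P_i @ x == x @ P_i
        P = P_i @ P
    return X_proj, P
```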
oLMpics-On What Language Model Pre-training Captures
- Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Transactions of the Association for Computational Linguistics (TACL)
- 31 December 2019
This work proposes eight reasoning tasks that conceptually require operations such as comparison, conjunction, and composition; its findings can help future work on designing new datasets, models, and objective functions for pre-training.
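For flavor, a zero-shot probe in the spirit of the comparison tasks might look as follows, assuming the HuggingFace fill-mask pipeline; the cloze phrasing is invented for illustration and is not the benchmark's actual template:

```python
from transformers import pipeline

# Comparison-style cloze probe (illustrative phrasing only).
unmasker = pipeline("fill-mask", model="bert-base-uncased")
query = "A 41 year old person is [MASK] than a 24 year old person."
# Inspect the top predictions, hoping for "older".
print([(p["token_str"], round(p["score"], 3)) for p in unmasker(query)[:3]])
```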
Adversarial Removal of Demographic Attributes from Text Data
- Yanai Elazar, Yoav Goldberg
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- 20 August 2018
It is shown that demographic information of authors is encoded in—and can be recovered from—the intermediate representations learned by text-based neural classifiers, and the implication is that decisions of classifiers trained on textual data are not agnostic to—and likely condition on—demographic attributes.
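The recovery claim can be tested with a simple diagnostic probe trained on frozen intermediate representations. A sketch, assuming precomputed arrays X and z; the file names are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical precomputed inputs:
#   X: intermediate representations of a trained text classifier (n x d)
#   z: protected demographic labels for the same texts
X = np.load("representations.npy")  # hypothetical file
z = np.load("demographics.npy")     # hypothetical file

X_tr, X_te, z_tr, z_te = train_test_split(X, z, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, z_tr)
# Accuracy well above the majority-class baseline indicates the
# attribute is linearly recoverable from the representations.
print("recovery accuracy:", probe.score(X_te, z_te))
```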
Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals
- Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg
- Transactions of the Association for Computational Linguistics (TACL)
- 7 December 2020
The inability to infer behavioral conclusions from probing results is pointed out, and an alternative method is offered that focuses on how the information is being used, rather than on what information is encoded.
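A toy illustration of the amnesic idea on synthetic data: remove a property via an INLP-style nullspace projection, then measure how much a downstream behavior that relies on it degrades. Everything below is illustrative, not the paper's setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 50))               # synthetic "representations"
prop = (X[:, 0] > 0).astype(int)              # property to forget
task = (X[:, 0] + X[:, 1] > 0).astype(int)    # behavior that uses the property

# Remove the property with a single nullspace projection.
w = LogisticRegression(max_iter=1000).fit(X, prop).coef_
X_amnesic = X @ (np.eye(X.shape[1]) - np.linalg.pinv(w) @ w)

head = LogisticRegression(max_iter=1000)
before = head.fit(X, task).score(X, task)
after = head.fit(X_amnesic, task).score(X_amnesic, task)
# A large drop suggests the behavior genuinely used the removed property.
print(f"task accuracy before={before:.2f}, after={after:.2f}")
```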
Measuring and Improving Consistency in Pretrained Language Models
- Yanai Elazar, Nora Kassner, Yoav Goldberg
- Transactions of the Association for Computational Linguistics (TACL)
- 2 February 2021
The creation of PARAREL, a high-quality resource of cloze-style English query paraphrases, and an analysis of the representational spaces of PLMs suggest that these spaces have poor structure and are currently not suitable for representing knowledge robustly.
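A minimal version of the consistency check, assuming the HuggingFace fill-mask pipeline; the paraphrase pair is invented, and PARAREL's actual patterns and scoring are richer:

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Two cloze-style paraphrases of the same relational fact
# (pair invented for illustration).
paraphrases = [
    "Paris is the capital of [MASK].",
    "The capital city of [MASK] is Paris.",
]
preds = [unmasker(p)[0]["token_str"] for p in paraphrases]
# The model is consistent on this fact iff both phrasings agree.
print(preds, "consistent:", len(set(preds)) == 1)
```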
Evaluating Models’ Local Decision Boundaries via Contrast Sets
- Matt Gardner, Yoav Artzi, Ben Zhou
- Findings of the Association for Computational Linguistics: EMNLP
- 6 April 2020
A more rigorous annotation paradigm for NLP is presented that helps to close systematic gaps in the test data; it recommends that dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets.
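For concreteness, a contrast-set pair for NLI might look like the following invented example: a small, meaningful edit to the hypothesis flips the gold label.

```python
# An original test instance and its contrastive perturbation (illustrative).
original = {
    "premise": "Two dogs are running through a field.",
    "hypothesis": "There are animals outdoors.",
    "label": "entailment",
}
contrast = {
    **original,
    "hypothesis": "There are no animals outdoors.",  # small, meaningful edit
    "label": "contradiction",                        # gold label flips
}
```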
Do Language Embeddings capture Scales?
- Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, D. Roth
- Findings of the Association for Computational Linguistics: EMNLP
- 11 October 2020
This work identifies contextual information in pre-training and numeracy as two key factors affecting how well language embeddings capture scale, and shows that a simple method of canonicalizing numbers can have a significant effect on the results.
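One plausible form of such canonicalization is mapping each literal number to a fixed scientific-notation token, as sketched below; this illustrates the general idea rather than the paper's exact scheme:

```python
import re

def canonicalize_numbers(text: str) -> str:
    # Map every literal number to a one-significant-digit
    # scientific-notation token, e.g. "1500" -> "1.5e+03".
    return re.sub(r"\d+(?:\.\d+)?",
                  lambda m: format(float(m.group()), ".1e"), text)

print(canonicalize_numbers("The tower is 1500 meters tall."))
# -> "The tower is 1.5e+03 meters tall."
```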
Evaluating NLP Models via Contrast Sets
- Matt Gardner, Yoav Artzi, Ben Zhou
- ArXiv
- 6 April 2020
A new annotation paradigm for NLP is proposed that helps to close systematic gaps in the test data, and it is recommended that after a dataset is constructed, the dataset authors manually perturb the test instances in small but meaningful ways that change the gold label, creating contrast sets.
Adversarial Removal of Demographic Attributes Revisited
- Maria Barrett, Yova Kementchedjhieva, Yanai Elazar, Desmond Elliott, Anders Søgaard
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- 1 November 2019
It is shown that a diagnostic classifier trained on the biased baseline neural network also does not generalize to new samples, indicating that it relies on correlations specific to the particular data sample.
Contrastive Explanations for Model Interpretability
- Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- 2 March 2021
The ability of label-contrastive explanations to provide fine-grained interpretability of model decisions is demonstrated, via both high-level abstract concept attribution and low-level input token/span attribution for two NLP classification benchmarks.
...