Publications
Universal Adversarial Triggers for Attacking and Analyzing NLP
Adversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset.
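The search for such triggers can be sketched with a gradient-guided token swap in the style of HotFlip. Below is a minimal, self-contained Python illustration; the toy bag-of-embeddings classifier, the dimensions, and the batch are all illustrative stand-ins, not the paper's actual models or setup.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim, classes, trig_len = 100, 16, 2, 3
emb = torch.randn(vocab, dim)                # frozen toy embedding matrix
W = torch.randn(dim, classes)                # frozen toy linear classifier

def loss_fn(token_embeds, labels):
    logits = token_embeds.mean(dim=1) @ W    # bag-of-embeddings "model"
    return F.cross_entropy(logits, labels)

inputs = torch.randint(vocab, (8, 10))       # batch of token-id "sentences"
labels = torch.zeros(8, dtype=torch.long)    # the batch's true labels

trigger = torch.randint(vocab, (trig_len,))  # random initialization
for _ in range(20):
    # Prepend the same trigger to every input (input-agnostic by design).
    batch = torch.cat([trigger.expand(8, -1), inputs], dim=1)
    embeds = emb[batch].clone().requires_grad_(True)
    loss_fn(embeds, labels).backward()
    grad = embeds.grad[:, :trig_len].mean(dim=0)
    # HotFlip-style first-order step: at each trigger position, swap in the
    # vocabulary token whose embedding most increases the batch loss.
    trigger = (grad @ emb.T).argmax(dim=1)

print("trigger token ids:", trigger.tolist())
```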
Deduplicating Training Data Mitigates Privacy Risks in Language Models
TLDR
The rate at which language models regenerate training sequences is superlinearly related to a sequence's count in the training set; after deduplicating the training data, language models are considerably more secure against privacy attacks.
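The mitigation this points to is removing repeated sequences before training. A minimal sketch of exact-match span deduplication, assuming whitespace tokenization and a fixed window size; corpus-scale deduplication uses far more efficient matching, so this is illustrative only.

```python
def dedup_exact_windows(docs, window=50):
    """Keep each document only if none of its length-`window` token spans
    has appeared earlier in the corpus (exact-match deduplication)."""
    seen, kept = set(), []
    for doc in docs:
        toks = doc.split()
        spans = {tuple(toks[i:i + window])
                 for i in range(max(1, len(toks) - window + 1))}
        if spans & seen:
            continue            # repeated span found: drop the duplicate
        seen.update(spans)
        kept.append(doc)
    return kept
```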
Music Enhancement via Image Translation and Vocoding
TLDR
This approach to music enhancement outperforms both baselines that use classical methods for mel-spectrogram inversion and an end-to-end approach that directly maps noisy waveforms to clean waveforms.
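A rough Python sketch of the pipeline this summary contrasts, using librosa's Griffin-Lim-based mel inversion as the classical baseline; `enhance_mel` is a hypothetical placeholder for the learned image-to-image translation model, and the parameters are illustrative. The paper's approach replaces the Griffin-Lim step with a neural vocoder.

```python
import librosa

def enhance_mel(mel):
    # Hypothetical stand-in for the learned image-to-image translation
    # model mapping a noisy mel spectrogram to a clean one (identity here).
    return mel

def enhance_classical_baseline(noisy_wav, sr=22050, n_mels=128):
    # Noisy waveform -> mel spectrogram, the "image" the translator edits.
    mel = librosa.feature.melspectrogram(y=noisy_wav, sr=sr, n_mels=n_mels)
    clean_mel = enhance_mel(mel)
    # Classical inversion baseline: Griffin-Lim via librosa; a neural
    # vocoder would take the place of this step.
    return librosa.feature.inverse.mel_to_audio(clean_mel, sr=sr)
```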