Evaluating the Robustness of Neural Language Models to Input Perturbations

Milad Moradi, Matthias Samwald
High-performance neural language models have obtained state-of-the-art results on a wide range of Natural Language Processing (NLP) tasks. However, results for common benchmark datasets often do not reflect model reliability and robustness when applied to noisy, real-world data. In this study, we design and implement various types of character-level and word-level perturbation methods to simulate realistic scenarios in which input texts may be slightly noisy or different from the data… 
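To make the idea of character-level and word-level perturbations concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the specific edits (adjacent-character swap, character deletion, word deletion), and the `rate` and `seed` parameters are all assumptions chosen for illustration.

```python
import random
from typing import List


def swap_adjacent_chars(word: str, rng: random.Random) -> str:
    """Swap two adjacent characters at a random position (typo-like, character level)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def delete_random_char(word: str, rng: random.Random) -> str:
    """Remove one randomly chosen character (character level)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def delete_random_word(tokens: List[str], rng: random.Random) -> List[str]:
    """Remove one randomly chosen token (word level)."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def perturb_sentence(sentence: str, rate: float = 0.2, seed: int = 0) -> str:
    """Apply an adjacent-character swap to each whitespace token with probability `rate`."""
    rng = random.Random(seed)  # hypothetical seeding choice, for reproducible noise
    tokens = sentence.split()
    noisy = [swap_adjacent_chars(t, rng) if rng.random() < rate else t for t in tokens]
    return " ".join(noisy)
```

A robustness check in this spirit would score a model on the original test inputs and again on `perturb_sentence(x)` for each input `x`, then compare the two scores.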

Interpreting the Robustness of Neural NLP Models to Textual Perturbations

This work conducts extensive experiments with four prominent NLP models (TextRNN, BERT, RoBERTa, and XLNet) over eight types of textual perturbations on three datasets, showing that a model that is better at identifying a perturbation becomes worse at ignoring that perturbation at test time (lower robustness), which provides empirical support for the hypothesis.

Robust Natural Language Processing: Recent Advances, Challenges, and Future Directions

This paper presents a structured overview of NLP robustness research by summarizing the literature systematically across various dimensions, and takes a deep dive into those dimensions of robustness across techniques, metrics, embeddings, and benchmarks.

On Sensitivity of Deep Learning Based Text Classification Algorithms to Practical Input Perturbations

This work shows that deep learning approaches, including BERT, are sensitive to such legitimate input perturbations on four standard benchmark datasets (SST2, TREC-6, BBC News, and tweet eval), and observes that BERT is more susceptible to the removal of tokens than to the addition of tokens.

Invariant Language Modeling

In a series of controlled experiments, the ability of the proposed invariant language modeling framework to remove structured noise, ignore specific spurious correlations without affecting global performance, and achieve better out-of-domain generalization is demonstrated.

Improving the robustness and accuracy of biomedical language models through adversarial training

Prediction Difference Regularization against Perturbation for Neural Machine Translation

This paper utilizes prediction difference for ground-truth tokens to analyze the fitting of token-level samples and finds that under-fitting is almost as common as over-fitting, so prediction difference regularization (PD-R) is introduced, a simple and effective method that can reduce over-fitting and under-fitting at the same time.

Causally Estimating the Sensitivity of Neural NLP Models to Spurious Features

This work quantifies model sensitivity to spurious features with a causal estimand, dubbed CENT, which draws on the concept of average treatment effect from the causality literature, to hypothesize and validate that models that are more sensitive to a spurious feature will be less robust against perturbations with this feature during inference.

Evaluation Gaps in Machine Learning Practice

The evaluation gaps between the idealized breadth of evaluation concerns and the observed narrow focus of actual evaluations are examined, pointing the way towards more contextualized evaluation methodologies for robustly examining the trustworthiness of ML models.

Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text

This paper surveys the issues with human and automatic model evaluations and with commonly used datasets in NLG that have been pointed out over the past 20 years, lays out a long-term vision for NLG evaluation, and proposes concrete steps to improve evaluation processes.

PSSAT: A Perturbed Semantic Structure Awareness Transferring Method for Perturbation-Robust Slot Filling

Experimental results show that the proposed perturbed semantic structure awareness transferring method consistently outperforms the previous basic methods and gains strong generalization while preventing the model from memorizing inherent patterns of entities and contexts.



How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?

The robustness of NLP against perturbed word forms is investigated by considering different noise distributions (one type of noise, combination of noise types) and mismatched noise distributions for training and testing.

Perturbation Sensitivity Analysis to Detect Unintended Model Biases

A generic evaluation framework, Perturbation Sensitivity Analysis, is proposed, which detects unintended model biases related to named entities, and requires no new annotations or corpora to be employed.

Evaluating Robustness to Input Perturbations for Neural Machine Translation

This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input and shows a clear trend of improved robustness when subword regularization methods are used.
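The summary above does not give the paper's metric definitions, so the following Python sketch only illustrates the general idea of measuring relative degradation and output changes under perturbation. Both functions, their names, and their formulas are assumptions, not the paper's metrics.

```python
from typing import Sequence


def relative_degradation(clean_score: float, perturbed_score: float) -> float:
    """Fraction of the clean-input score lost on perturbed input (assumed formula)."""
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    return (clean_score - perturbed_score) / clean_score


def output_change_rate(clean_outputs: Sequence[str], perturbed_outputs: Sequence[str]) -> float:
    """Fraction of outputs that differ once the input is perturbed (assumed formula)."""
    if len(clean_outputs) != len(perturbed_outputs):
        raise ValueError("output lists must have the same length")
    if not clean_outputs:
        return 0.0
    changed = sum(c != p for c, p in zip(clean_outputs, perturbed_outputs))
    return changed / len(clean_outputs)
```

For example, a quality score that falls from 40.0 on clean input to 30.0 on perturbed input gives `relative_degradation(40.0, 30.0) == 0.25`, meaning a quarter of the clean-input quality is lost.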

Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies

It is concluded that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.

Improving the Reliability of Deep Neural Networks in NLP: A Review

Towards Robustness Against Natural Language Word Substitutions

A novel Adversarial Sparse Convex Combination (ASCC) method is introduced, which models the word substitution attack space as a convex hull and leverages a regularization term to enforce perturbation towards an actual substitution, thus aligning the modeling better with the discrete textual space.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, is introduced; it is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList

Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models