"Why Should I Trust You?": Explaining the Predictions of Any Classifier
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
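The core idea — fitting an interpretable model locally around one prediction — can be illustrated with a minimal sketch. The black-box function, the perturbation scale, and the kernel width below are all illustrative assumptions, not the paper's actual configuration; LIME itself also handles interpretable input representations, which this sketch omits.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical black-box classifier: probability of class 1 for a 2-D input.
# (Stand-in for any model; LIME never looks inside it.)
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 3.0 * X[:, 1])))

rng = np.random.default_rng(0)
x0 = np.array([0.5, -0.2])  # the single instance whose prediction we explain

# 1. Sample perturbations around x0.
Z = x0 + rng.normal(scale=0.3, size=(500, 2))

# 2. Weight each sample by its proximity to x0 (exponential kernel,
#    width 0.3 chosen arbitrarily for this sketch).
dists = np.linalg.norm(Z - x0, axis=1)
weights = np.exp(-(dists ** 2) / (2 * 0.3 ** 2))

# 3. Fit a weighted linear surrogate to the black box's outputs.
surrogate = Ridge(alpha=1.0)
surrogate.fit(Z, black_box(Z), sample_weight=weights)

# The surrogate's coefficients serve as the local explanation: here
# feature 0 pushes the prediction up and feature 1 pushes it down.
print(surrogate.coef_)
```

The locality weighting is what makes the explanation faithful near x0 rather than a global approximation of the model.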
Anchors: High-Precision Model-Agnostic Explanations
We introduce a novel model-agnostic system that explains the behavior of complex models with high-precision rules called anchors, representing local, "sufficient" conditions for predictions.
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
A new reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs, and presents a new model that combines reading comprehension methods with simple numerical reasoning to achieve 51% F1.
Knowledge Enhanced Contextual Word Representations
After integrating WordNet and a subset of Wikipedia into BERT, the knowledge-enhanced BERT (KnowBert) demonstrates improved perplexity, the ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation.
Model-Agnostic Interpretability of Machine Learning
This paper argues for explaining machine learning predictions using model-agnostic approaches, treating the machine learning models as black-box functions, which provide crucial flexibility in the choice of models, explanations, and representations, improving debugging, comparison, and interfaces for a variety of users and models.
Generating Natural Adversarial Examples
This paper proposes a framework to generate natural and legible adversarial examples that lie on the data manifold, by searching in the semantic space of a dense and continuous data representation, utilizing recent advances in generative adversarial networks.
Design Challenges for Entity Linking
This work analyzes differences between several versions of the EL problem, presents a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking, and elucidates key aspects of the system, including mention extraction, candidate generation, entity type prediction, entity coreference, and coherence.
Semantically Equivalent Adversarial Rules for Debugging NLP models
This work presents semantically equivalent adversaries (SEAs) – semantic-preserving perturbations that induce changes in the model's predictions – and generalizes them into simple replacement rules that induce adversaries on many semantically similar instances.
Universal Adversarial Triggers for Attacking and Analyzing NLP
Adversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that, when concatenated to any input from a dataset, trigger a model to produce a specific prediction.
Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links to Wikipedia
This paper presents a method for automatically gathering massive amounts of naturally occurring cross-document reference data, based on finding hyperlinks to Wikipedia in a web crawl and using anchor text as mentions, and presents the Wikilinks dataset, comprising 40 million mentions of 3 million entities.