Corpus ID: 235606403

Synthetic Benchmarks for Scientific Research in Explainable Machine Learning

Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger
As machine learning models grow more complex and their applications become more high-stakes, tools for explaining model predictions have become increasingly important. This has spurred a flurry of research in model explainability and has given rise to feature attribution methods such as LIME and SHAP. Despite their widespread use, evaluating and comparing different feature attribution methods remains challenging: evaluations ideally require human studies, and empirical evaluation metrics are…
Deep Neural Networks and Tabular Data: A Survey
This work provides an overview of state-of-the-art deep learning methods for tabular data by categorizing them into three groups: data transformations, specialized architectures, and regularization models.


Entropy and Distance of Random Graphs with Application to Structural Pattern Recognition
  • A. Wong, Manlai You
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 1985
To synthesize an ensemble of attributed graphs into the distribution of a random graph (or a set of distributions), this work proposes a distance measure between random graphs based on the minimum change of entropy before and after their merging.
ERASER: A Benchmark to Evaluate Rationalized NLP Models
This work proposes Evaluating Rationales And Simple English Reasoning (ERASER), a benchmark to advance research on interpretable models in NLP, along with several metrics that aim to capture how well the rationales provided by models align with human rationales and how faithful these rationales are.
A Benchmark for Interpretability Methods in Deep Neural Networks
An empirical measure of the approximate accuracy of feature importance estimates in deep neural networks is proposed, and it is shown that some approaches do no better than the underlying method but carry a far higher computational burden.
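The evaluation idea summarized above can be illustrated with a minimal ablation sketch: rank features by an importance estimate, remove the top-ranked ones, and measure how much accuracy drops. (The actual benchmark in the paper retrains the model after removal, which this sketch skips; all names here are illustrative, not from any library.)

```python
import random

def ablation_score(model, importances, X, y, k):
    """Evaluate a feature-importance ranking by ablation: replace the
    top-k most important features with a constant (here 0.0) and measure
    the remaining accuracy. A sharper drop suggests the ranking found
    the features the model truly relies on."""
    top_k = sorted(range(len(importances)), key=lambda j: -importances[j])[:k]
    correct = 0
    for x, label in zip(X, y):
        x_ablated = [0.0 if j in top_k else v for j, v in enumerate(x)]
        correct += int(model(x_ablated) == label)
    return correct / len(X)

# Toy setup: the model only looks at feature 0.
model = lambda x: int(x[0] > 0.5)
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [model(x) for x in X]

good_ranking = [1.0, 0.0, 0.0]   # correctly flags feature 0
bad_ranking = [0.0, 0.0, 1.0]    # flags an irrelevant feature

print(ablation_score(model, good_ranking, X, y, k=1))  # accuracy collapses toward chance
print(ablation_score(model, bad_ranking, X, y, k=1))   # accuracy stays at 1.0
```

Ablating the feature flagged by the good ranking destroys the model's accuracy, while ablating the feature flagged by the bad ranking changes nothing, so the drop in score separates informative attributions from uninformative ones.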
Generating Contrastive Explanations with Monotonic Attribute Functions
This paper proposes a method that can generate contrastive explanations for deep neural networks, where aspects that are in themselves sufficient to justify the classification by the deep model are highlighted, along with new aspects which, if added, would change the classification.
Towards Robust Interpretability with Self-Explaining Neural Networks
This work designs self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models, and proposes three desiderata for explanations in general: explicitness, faithfulness, and stability.
A Unified Approach to Interpreting Model Predictions
A unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which unifies six existing methods and presents new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
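The additive-attribution idea underlying SHAP can be sketched with an exact, brute-force Shapley-value computation on a toy model. This is not the SHAP library's API, just a self-contained illustration of the quantity it approximates; the function name and baseline convention are illustrative.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for model f at point x, relative to a baseline.

    Features absent from a coalition are set to their baseline value.
    Exponential in the number of features, so only usable on tiny examples."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                with_i = [x[j] if j in S or j == i else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy linear model: attributions should equal w_j * (x_j - baseline_j).
f = lambda v: 2.0 * v[0] + 3.0 * v[1] + 1.0
print(shapley_values(f, [1.0, 1.0], [0.0, 0.0]))  # -> [2.0, 3.0]
```

The attributions sum to f(x) minus f(baseline), the "efficiency" property that makes the explanation additive; practical methods such as Kernel SHAP approximate these values by sampling coalitions instead of enumerating them.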
A Survey on Neural Network Interpretability
This survey conducts a comprehensive review of the neural network interpretability research and proposes a novel taxonomy organized along three dimensions: type of engagement (passive vs. active interpretation approaches), type of explanation, and focus (from local to global interpretability).
Explaining individual predictions when features are dependent: More accurate approximations to Shapley values
This work extends the Kernel SHAP method to handle dependent features, and proposes a method for aggregating individual Shapley values, such that the prediction can be explained by groups of dependent variables.