• Corpus ID: 235790487

Understanding surrogate explanations: the interplay between complexity, fidelity and coverage

Rafael Poyiadzi, X. Renard, Thibault Laugel, Raúl Santos-Rodríguez, Marcin Detyniecki
This paper analyses the fundamental ingredients behind surrogate explanations to provide a better understanding of their inner workings. We start our exposition by considering global surrogates, describing the trade-off between the complexity of the surrogate and its fidelity to the black box being modelled. We show that transitioning from global to local surrogates (reducing coverage) allows for more favourable conditions on the fidelity-complexity Pareto frontier of a surrogate. We discuss the interplay…
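The local-surrogate idea the abstract builds on can be sketched in a few lines. In the sketch below all names and data are illustrative: a decision tree stands in for an arbitrary black box, Ridge regression for the simple (low-complexity) surrogate, and the sampling radius plays the role of coverage.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# A stand-in black-box model: a depth-3 decision tree on synthetic data.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] ** 2 > 0).astype(int)
black_box = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

def explain_locally(instance, coverage=1.0, n_samples=200):
    """Fit a linear surrogate on samples drawn around `instance`.

    `coverage` scales the sampling radius: a smaller value yields a more
    local surrogate, which typically attains higher fidelity for the
    same fixed (linear) complexity.
    """
    neighbours = instance + coverage * rng.normal(size=(n_samples, instance.size))
    targets = black_box.predict_proba(neighbours)[:, 1]
    surrogate = Ridge(alpha=1.0).fit(neighbours, targets)
    # Fidelity: how well the surrogate reproduces the black box locally (R^2).
    fidelity = surrogate.score(neighbours, targets)
    return surrogate.coef_, fidelity

x0 = np.zeros(4)
coefs_local, fid_local = explain_locally(x0, coverage=0.3)
coefs_global, fid_global = explain_locally(x0, coverage=3.0)
```

Comparing `fid_local` against `fid_global` makes the coverage-fidelity trade-off concrete: shrinking the sampled neighbourhood generally moves a fixed-complexity surrogate to a more favourable point on the Pareto frontier.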


Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach

This paper produces estimates of the uncertainty of a given explanation by measuring the ordinal consensus amongst a set of diverse bootstrapped surrogate explainers, and proposes and analyses metrics to aggregate the information contained within the set of explainers through a rating scheme.
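The bootstrapped-consensus idea can be illustrated with a small sketch. Kendall's W over feature rankings stands in here for the paper's rating scheme, and all names and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def feature_ranks(importances):
    """Rank features by absolute importance (0 = most important)."""
    return np.argsort(np.argsort(-np.abs(importances)))

def kendalls_w(rank_matrix):
    """Kendall's coefficient of concordance over m rankings of n items."""
    m, n = rank_matrix.shape
    totals = rank_matrix.sum(axis=0)            # per-feature rank sums
    s = np.sum((totals - totals.mean()) ** 2)   # spread of the rank sums
    return 12.0 * s / (m**2 * (n**3 - n))

# Stand-in importances from m bootstrapped surrogate explainers: a shared
# signal plus small per-explainer noise.
m_explainers, n_features = 5, 4
base = np.array([2.0, 1.0, 0.5, 0.1])
importances = base + 0.05 * rng.normal(size=(m_explainers, n_features))

ranks = np.vstack([feature_ranks(imp) for imp in importances])
consensus = kendalls_w(ranks)  # close to 1.0 when the explainers agree
```

Low consensus flags an explanation whose feature ordering is unstable under resampling, which is exactly the kind of uncertainty the paper sets out to quantify.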

Interpretability from a new lens: Integrating Stratification and Domain knowledge for Biomedical Applications

A novel computational strategy is proposed for stratifying biomedical problem datasets into k-fold cross-validation (CV) splits and for integrating domain-knowledge interpretation techniques into current state-of-the-art IML frameworks.

Attention-like feature explanation for tabular data

A new method for local and global explanation of machine learning black-box model predictions on tabular data is proposed. It is implemented as a system called AFEX (Attention-like Feature Explanation).

On the overlooked issue of defining explanation objectives for local-surrogate explainers

This work reviews the similarities and differences amongst multiple methods for explaining machine learning model predictions, with a particular focus on what information they extract from the model, as this has a large impact on the output: the explanation.

How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations

This work develops a novel set of tools for analyzing explanation uncertainty in a Bayesian framework that estimates credible intervals (CIs) that capture the uncertainty associated with each feature importance in local explanations.

Defining Locality for Surrogates in Post-hoc Interpretability

This paper proposes to generate surrogate-based explanations for individual predictions based on a sampling centered on a particular place of the decision boundary that is relevant for the prediction to be explained, rather than on the prediction itself, as is classically done.

bLIMEy: Surrogate Prediction Explanations Beyond LIME

This paper demonstrates how to decompose the surrogate explainers family into algorithmically independent and interoperable modules and discusses the influence of these component choices on the functional capabilities of the resulting explainer, using the example of LIME.

Explainers in the Wild: Making Surrogate Explainers Robust to Distortions Through Perception

  • Alexander Hepburn, Raúl Santos-Rodríguez
  • Computer Science
  • 2021 IEEE International Conference on Image Processing (ICIP)
  • 2021
This paper proposes a methodology to evaluate the effect of distortions on explanations by embedding perceptual distances that tailor the neighbourhoods used to train surrogate explainers, and shows that operating in this way makes the explanations more robust to distortions.

Improving the Quality of Explanations with Local Embedding Perturbations

This work proposes a new neighborhood generation method that first fits a local embedding/subspace around a given instance, using the local intrinsic dimensionality (LID) of the test instance as the target dimensionality, then generates neighbors in the local embedding and projects them back to the original space.
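A minimal sketch of this neighbour-generation scheme, with PCA as the local embedding and a fixed target dimensionality standing in for the LID estimate (all names and data are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))  # stand-in training data
x0 = X[0]                      # instance to explain

def local_embedding_neighbours(X, instance, k=50, dim=2, n_samples=100):
    """Generate neighbours inside a local PCA subspace around `instance`."""
    # The k nearest training points define the local neighbourhood.
    dists = np.linalg.norm(X - instance, axis=1)
    local = X[np.argsort(dists)[:k]]
    # Fit the local embedding; `dim` would come from an LID estimate.
    pca = PCA(n_components=dim).fit(local)
    z0 = pca.transform(instance.reshape(1, -1))
    # Perturb in the embedding, then project back to the original space.
    z = z0 + 0.1 * rng.normal(size=(n_samples, dim))
    return pca.inverse_transform(z)

neighbours = local_embedding_neighbours(X, x0)
```

The returned neighbours lie on the locally fitted subspace rather than being sampled isotropically, which is the core of the proposed perturbation scheme.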

“Why Should I Trust You?”: Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.

Local Rule-Based Explanations of Black Box Decision Systems

This paper proposes LORE, an agnostic method able to provide interpretable and faithful explanations for black box outcome explanation, and shows that LORE outperforms existing methods and baselines both in the quality of explanations and in the accuracy in mimicking the black box.

Exploiting patterns to explain individual predictions

PALEX is compared to several state-of-the-art explanation methods over a range of benchmark datasets and finds that it can identify explanations with both high precision and high recall.

Second thoughts on the bootstrap

This brief review article is appearing in the issue of Statistical Science that marks the 25th anniversary of the bootstrap. It concerns some of the theoretical and methodological aspects of the subject.