Shared Interest: Measuring Human-AI Alignment to Identify Recurring Patterns in Model Behavior

@inproceedings{boggust2022sharedinterest,
  title={Shared Interest: Measuring Human-AI Alignment to Identify Recurring Patterns in Model Behavior},
  author={Angie Boggust and Benjamin Hoover and Arvindmani Satyanarayan and Hendrik Strobelt},
  booktitle={CHI Conference on Human Factors in Computing Systems},
  year={2022}
}
Saliency methods — techniques to identify the importance of input features on a model’s output — are a common step in understanding neural network behavior. However, interpreting saliency requires tedious manual inspection to identify and aggregate patterns in model behavior, resulting in ad hoc or cherry-picked analysis. To address these concerns, we present Shared Interest: metrics for comparing model reasoning (via saliency) to human reasoning (via ground truth annotations). By providing… 
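The core comparison the abstract describes — scoring agreement between a saliency map and a human ground-truth annotation — can be sketched as an intersection-over-union check. This is a minimal illustration, not the paper's API; the function name and the 0.5 threshold are assumptions.

```python
import numpy as np

def iou_coverage(saliency, ground_truth, threshold=0.5):
    """IoU between a thresholded saliency map and a human annotation mask.

    saliency: 2D float array of per-pixel importances in [0, 1].
    ground_truth: binary mask of the human-annotated region.
    NOTE: the function name and threshold are illustrative, not the paper's API.
    """
    s = saliency >= threshold                  # binarize model attention
    g = ground_truth.astype(bool)
    union = np.logical_or(s, g).sum()
    if union == 0:
        return 0.0
    return np.logical_and(s, g).sum() / union
```

A map that exactly matches the annotation scores 1.0; scores near 0 flag examples where model reasoning and human reasoning diverge, which is the kind of recurring pattern the metrics are meant to surface.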

Beyond Faithfulness: A Framework to Characterize and Compare Saliency Methods
This work describes a framework of nine dimensions to characterize and compare the properties of saliency methods, and identifies opportunities for future work, including filling gaps in the landscape and developing new evaluation metrics.


Visualizing Higher-Layer Features of a Deep Network
This paper contrasts and compares several techniques applied to Stacked Denoising Autoencoders and Deep Belief Networks, trained on several vision datasets, and shows that good qualitative interpretations of the high-level features represented by such models are possible at the unit level.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Captum: A unified and generic model interpretability library for PyTorch
Captum Insights is an interactive visualization tool built on top of the Captum library that enables sample-based model debugging and visualization using feature-importance metrics, and is designed for easy understanding and use.
What made you do this? Understanding black-box decisions with sufficient input subsets
This work proposes sufficient input subsets that identify minimal subsets of features whose observed values alone suffice for the same decision to be reached, even if all other input feature values are missing.
Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy.
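The method this axiomatic analysis leads to, Integrated Gradients, averages gradients along a straight path from a baseline to the input. A minimal sketch, using a toy quadratic function with a closed-form gradient in place of a real network (the function, step count, and names are illustrative):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Midpoint-rule approximation of the attribution path integral."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Toy stand-in for a network's input gradient: f(x) = sum(x**2), grad = 2x.
grad_fn = lambda x: 2 * x
x = np.array([1.0, 2.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_fn, x, baseline)
# Completeness: attributions sum to f(x) - f(baseline) = 5.
```

The completeness property (attributions summing to the output difference) is what the axioms buy: here attr is exactly [1, 4], summing to 5.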
Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC)
The design, implementation, and results of the latest installment of the dermoscopic image analysis benchmark challenge are described, to support research and development of algorithms for automated diagnosis of melanoma, the most lethal skin cancer.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
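The local-surrogate idea can be sketched for tabular data: perturb the instance, weight perturbed samples by proximity, and fit a weighted linear model whose coefficients serve as the explanation. This is a minimal sketch of the idea, not LIME's actual implementation; the zero-baseline perturbation, kernel, and all names are assumptions.

```python
import numpy as np

def lime_weights(model, x, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a weighted linear surrogate to `model` around instance `x`.

    NOTE: illustrative sketch; perturbation scheme and kernel are
    simplified choices, not LIME's exact ones.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    masks = rng.integers(0, 2, size=(n_samples, d))   # which features kept
    samples = masks * x                               # toggled-off -> 0
    preds = np.array([model(s) for s in samples])
    dist = 1 - masks.mean(axis=1)                     # fraction removed
    w = np.exp(-(dist ** 2) / kernel_width ** 2)      # proximity kernel
    X = np.hstack([masks, np.ones((n_samples, 1))])   # add intercept column
    Xw = X * np.sqrt(w)[:, None]                      # weighted least squares
    coef, *_ = np.linalg.lstsq(Xw, preds * np.sqrt(w), rcond=None)
    return coef[:d]                                   # per-feature weights
```

For a model that is already linear in the kept features, the surrogate recovers the true coefficients exactly; for a nonlinear model it yields a local, interpretable approximation.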
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
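The update rule from the paper's Algorithm 1 is compact enough to sketch directly (a minimal NumPy version; `adam_step` and its signature are this sketch's naming, not a library API):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam parameter update, following the paper's Algorithm 1."""
    m = b1 * m + (1 - b1) * grad         # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2    # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)            # bias correction (t counts from 1)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

A notable consequence of the adaptive moments: the very first step has magnitude close to `lr` regardless of the gradient's scale.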
Quick Shift and Kernel Methods for Mode Seeking
We show that the complexity of the recently introduced medoid-shift algorithm in clustering N points is O(N²), with a small constant, if the underlying distance is Euclidean. This makes medoid shift considerably faster than mean shift, contrary to what was previously believed.
Adam: A Method for Stochastic Optimization
  • In Proceedings of the International Conference on Learning Representations (ICLR). San Diego, USA, 2015