
How does this interaction affect me? Interpretable attribution for feature interactions

@article{Tsang2020HowDT,
  title={How does this interaction affect me? Interpretable attribution for feature interactions},
  author={Michael Tsang and Sirisha Rambhatla and Yan Liu},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.10965}
}
Machine learning transparency calls for interpretable explanations of how inputs relate to predictions. Feature attribution is a way to analyze the impact of features on predictions. Feature interactions are contextual dependencies between features that jointly impact predictions. A number of methods extract feature interactions from prediction models; however, the methods that assign attributions to interactions are either uninterpretable, model-specific, or non-axiomatic. We…
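To make the abstract's distinction concrete, here is a minimal Python sketch (a toy illustration with an assumed zero baseline, not the paper's attribution method): a model whose output depends on two features jointly, so toggling either feature alone attributes zero impact, while a discrete mixed difference exposes the interaction.

```python
# Toy model with a pure pairwise interaction: the output depends on
# x1 AND x2 jointly, so neither feature has an effect on its own.
def f(x1, x2):
    return x1 * x2

# Discrete mixed difference relative to a baseline (b1, b2) -- a standard
# test for a non-additive interaction between two features. For any
# additive model f(x1, x2) = g(x1) + h(x2), this quantity is exactly 0.
def pairwise_interaction(f, x1, x2, b1, b2):
    return f(x1, x2) - f(x1, b2) - f(b1, x2) + f(b1, b2)

x1, x2 = 1.0, 1.0   # input of interest
b1, b2 = 0.0, 0.0   # baseline ("feature absent")

# Toggling one feature at a time attributes zero impact to each feature...
solo_x1 = f(x1, b2) - f(b1, b2)   # 0.0
solo_x2 = f(b1, x2) - f(b1, b2)   # 0.0

# ...while the mixed difference reveals the joint effect they create together.
joint = pairwise_interaction(f, x1, x2, b1, b2)   # 1.0

print(solo_x1, solo_x2, joint)
```

An attribution method that assigns credit only per-feature would report no impact here; assigning an attribution to the pair {x1, x2} recovers the joint effect.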
Citations

Fine-grained Interpretation and Causation Analysis in Deep NLP Models
Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals
Towards Rigorous Interpretations: a Formalisation of Feature Attribution
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks
PredDiff: Explanations and Interactions from Conditional Expectations
Refining Neural Networks with Compositional Explanations
