On The Reasons Behind Decisions

@article{Darwiche2020OnTR,
  title={On The Reasons Behind Decisions},
  author={Adnan Darwiche and Auguste Hirth},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.09284}
}
Recent work has shown that some common machine learning classifiers can be compiled into Boolean circuits that have the same input-output behavior. We present a theory for unveiling the reasons behind the decisions made by Boolean classifiers and study some of its theoretical and practical implications. We define notions such as sufficient, necessary and complete reasons behind decisions, in addition to classifier and decision bias. We show how these notions can be used to evaluate… 
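
As a rough illustration of these notions, the sufficient reasons behind a decision can be enumerated by brute force for a small Boolean function. The toy classifier, feature names, and helper functions below are hypothetical, not taken from the paper:

from itertools import combinations, product

def classifier(x):
    # Hypothetical toy classifier: admit iff gpa and (test or essay).
    gpa, test, essay = x
    return (gpa and test) or (gpa and essay)

def is_sufficient(fixed, f, n):
    # A partial assignment (dict: feature index -> value) is sufficient
    # for decision 1 if every completion of the free features yields 1.
    free = [i for i in range(n) if i not in fixed]
    for values in product([0, 1], repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, values))
        if not f(tuple(x[i] for i in range(n))):
            return False
    return True

def sufficient_reasons(f, instance):
    # Enumerate the minimal sufficient subsets of the instance's features
    # (brute force, so only feasible for a handful of features).
    n = len(instance)
    reasons = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if any(set(r) <= set(subset) for r in reasons):
                continue  # a smaller sufficient subset already covers it
            if is_sufficient({i: instance[i] for i in subset}, f, n):
                reasons.append(subset)
    return reasons

print(sufficient_reasons(classifier, (1, 1, 1)))
# -> [(0, 1), (0, 2)]: {gpa, test} and {gpa, essay} are each sufficient,
# and gpa occurs in every sufficient reason, so it is necessary.

Roughly, in the paper's terminology, the complete reason behind the decision is the disjunction of these sufficient reasons, and the decision is biased if it could flip under a change to protected features alone.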

Citations

Sufficient reasons for classifier decisions in the presence of constraints
TLDR
The main idea is to view classifiers in the presence of constraints as describing partial Boolean functions, i.e., functions that are undefined on instances that do not satisfy the constraints, and it is proved that this simple idea results in reasons that are no less succinct (and sometimes more succinct).
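
A minimal sketch of that idea, reusing classifier and the imports from the sketch above (the constraint itself is hypothetical): sufficiency is checked only over completions that satisfy the constraint, which is exactly what lets reasons shrink.

def constraint(x):
    # Hypothetical domain constraint: every feasible instance has
    # test = 1 or essay = 1.
    _, test, essay = x
    return test or essay

def is_sufficient_under(fixed, f, c, n):
    # Sufficiency over the partial function: completions violating the
    # constraint c are simply ignored.
    free = [i for i in range(n) if i not in fixed]
    for values in product([0, 1], repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, values))
        x = tuple(x[i] for i in range(n))
        if c(x) and not f(x):
            return False
    return True

print(is_sufficient_under({0: 1}, classifier, lambda x: True, 3))  # False
print(is_sufficient_under({0: 1}, classifier, constraint, 3))      # True: gpa alone now suffices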
On Symbolically Encoding the Behavior of Random Forests
TLDR
This work addresses systems with discrete inputs and outputs, including ones with discretized continuous variables as in systems based on decision trees, and focuses on the suitability of encodings for computing prime implicants, which have recently played a central role in explaining the decisions of machine learning systems.
On the Explanatory Power of Decision Trees
TLDR
It is proved that the set of all sufficient reasons of minimal size for an instance given a decision tree can be exponentially larger than the size of the input (the instance and the decision tree), and that generating the full set of sufficient reasons can be out of reach.
On the Tractability of Explaining Decisions of Classifiers
TLDR
This work investigates the computational complexity of providing a formally correct and minimal explanation of a decision taken by a classifier, and shows that the tractable classes coincide for abductive and contrastive explanations in both the constrained and unconstrained settings.
On the Computational Intelligibility of Boolean Classifiers
TLDR
A large intelligibility gap between families of classifiers is shown to exist: none of the queries considered is tractable for DNF formulae, decision lists, random forests, boosted decision trees, Boolean multilayer perceptrons, or binarized neural networks.
On Tractable XAI Queries based on Compiled Representations
TLDR
This paper defines new explanation and/or verification queries about classifiers and shows how they can be addressed by combining queries and transformations on the associated Boolean circuits.
Probabilistic Sufficient Explanations
TLDR
Probabilistic sufficient explanations are introduced, which formulate explaining an instance of classification as choosing the “simplest” subset of features such that only observing those features is “sufficient” to explain the classification.
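
A hedged sketch of that formulation, reusing classifier and the imports from the first sketch and assuming, purely for illustration, a uniform distribution over the unobserved features:

def support(fixed, f, n):
    # Pr[f = 1 | observed features] under the assumed uniform
    # distribution over the unobserved features.
    free = [i for i in range(n) if i not in fixed]
    hits, total = 0, 0
    for values in product([0, 1], repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, values))
        hits += f(tuple(x[i] for i in range(n)))
        total += 1
    return hits / total

def probabilistic_explanation(f, instance, delta=0.9):
    # Smallest observed subset whose support reaches the threshold delta.
    n = len(instance)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if support({i: instance[i] for i in subset}, f, n) >= delta:
                return subset
    return tuple(range(n))

print(probabilistic_explanation(classifier, (1, 1, 1)))
# -> (0, 1): observing gpa and test already forces the decision (support 1.0)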
On Deciding Feature Membership in Explanations of SDD & Related Classifiers
TLDR
The paper proves that, for any classifier for which an explanation can be computed in polynomial time, feature membership in an explanation can be decided with one NP oracle call, and proposes propositional encodings for classifiers represented with Sentential Decision Diagrams and for other related propositional languages.
Eliminating The Impossible, Whatever Remains Must Be True
TLDR
This paper shows how to use existing rule induction techniques to efficiently extract background information from a dataset, and also how to report which background information was used to make an explanation, allowing a human to examine it if they doubt the correctness of the explanation.
SAT-Based Rigorous Explanations for Decision Lists
TLDR
This paper shows that computing explanations for DLs is computationally hard, proposes propositional encodings for computing abductive and contrastive explanations of DLs, and investigates the practical efficiency of a MARCO-like approach for enumerating explanations.
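
The deletion-based scheme underlying many such approaches can be sketched in a few lines, reusing classifier and is_sufficient from the first sketch; the paper itself would replace the exhaustive sufficiency test with calls to a SAT oracle over a propositional encoding of the decision list:

def one_abductive_explanation(f, instance):
    # Try to drop each feature in turn; keep the drop whenever the
    # remaining assignment is still sufficient for the decision.
    n = len(instance)
    keep = set(range(n))
    for i in range(n):
        if is_sufficient({j: instance[j] for j in keep - {i}}, f, n):
            keep.discard(i)  # feature i is redundant; drop it permanently
    return sorted(keep)

print(one_abductive_explanation(classifier, (1, 1, 1)))
# -> [0, 2]: one subset-minimal explanation; the deletion order
# determines which of the minimal explanations is returned.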
...

References

Showing 1-10 of 36 references
Formal Verification of Bayesian Network Classifiers
TLDR
It is shown in this paper that the approach of first compiling a given classifier into a tractable representation, called an Ordered Decision Diagram, also gives the ability to verify the behavior of classifiers.
Compiling Bayesian Network Classifiers into Decision Graphs
TLDR
An algorithm is proposed for compiling Bayesian network classifiers into decision graphs that mimic the input-output behavior of the classifiers; these graphs are tractable and can be exponentially smaller than decision trees.
A Symbolic Approach to Explaining Bayesian Network Classifiers
We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two…
Reasoning about Bayesian Network Classifiers
TLDR
This paper presents an algorithm for converting any naive Bayes classifier into an ODD, and it is shown theoretically and experimentally that this algorithm can give an ODD that is tractable in size even given an intractable number of instances.
Abduction-Based Explanations for Machine Learning Models
TLDR
A constraint-agnostic solution for computing explanations for any ML model is proposed; it exploits abductive reasoning and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle.
On Validating, Repairing and Refining Heuristic ML Explanations
TLDR
Earlier work is extended to the case of boosted trees, and the quality of explanations obtained with state-of-the-art heuristic approaches is assessed.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
TLDR
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
The Language of Search
TLDR
This paper shows that several versions of exhaustive DPLL search correspond to such well-known languages as FBDD, OBDD, and a precisely-defined subset of d-DNNF.
On Relating Explanations and Adversarial Examples
TLDR
It is demonstrated that explanations and adversarial examples are related by a generalized form of hitting set duality, which extends earlier work on hitting set duality observed in model-based diagnosis and knowledge compilation.
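
A brute-force check of this duality on the toy classifier from the first sketch (reusing sufficient_reasons and the imports), with minimal feature sets whose reassignment can flip the decision standing in for adversarial examples:

def contrastive_sets(f, instance):
    # Minimal feature sets some reassignment of which flips the decision.
    n = len(instance)
    found = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if any(set(s) <= set(subset) for s in found):
                continue  # a smaller flipping set already covers it
            for values in product([0, 1], repeat=size):
                x = list(instance)
                for i, v in zip(subset, values):
                    x[i] = v
                if f(tuple(x)) != f(instance):
                    found.append(subset)
                    break
    return found

axps = sufficient_reasons(classifier, (1, 1, 1))  # [(0, 1), (0, 2)]
cxps = contrastive_sets(classifier, (1, 1, 1))    # [(0,), (1, 2)]
# Hitting set duality: every flipping set intersects every sufficient reason.
assert all(set(c) & set(a) for c in cxps for a in axps)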
Decomposable negation normal form
TLDR
It is shown that DNNF is universal; supports a rich set of polynomial-time logical operations; is more space-efficient than OBDDs; and is very simple as far as its structure and algorithms are concerned.
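
A minimal sketch of why decomposability pays off (the class names are hypothetical): on a DNNF circuit, consistency checking reduces to one bottom-up pass, because the conjuncts of every AND node share no variables and can therefore be satisfied independently:

class Lit:
    def __init__(self, var, positive):
        self.var, self.positive = var, positive

class And:
    def __init__(self, *children):
        self.children = children  # assumed variable-disjoint (decomposable)

class Or:
    def __init__(self, *children):
        self.children = children

def satisfiable(node):
    if isinstance(node, Lit):
        return True  # a lone literal is always satisfiable
    if isinstance(node, And):
        return all(satisfiable(c) for c in node.children)
    return any(satisfiable(c) for c in node.children)

# (x1 AND x2) OR (NOT x1 AND x3): each conjunct is variable-disjoint
circuit = Or(And(Lit(1, True), Lit(2, True)),
             And(Lit(1, False), Lit(3, True)))
print(satisfiable(circuit))  # True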
...