Self-explaining AI as an Alternative to Interpretable AI

  • Daniel C. Elton
  • Published in AGI, 12 February 2020
  • Computer Science
The ability to explain decisions made by AI systems is highly sought after, especially in domains where human lives are at stake, such as medicine or autonomous vehicles. While it is often possible to approximate the input-output relations of deep neural networks with a few human-understandable rules, the discovery of the double descent phenomenon suggests that such approximations do not accurately capture the mechanism by which deep neural networks work. Double descent indicates that deep neural…

Teaching the Machine to Explain Itself using Domain Knowledge

JOEL is a neural network-based framework that jointly learns a decision-making task and associated explanations. The explanations convey domain knowledge and closely resemble experts' own reasoning, making the framework suitable for human-in-the-loop domain experts who lack deep technical ML knowledge.

The Achilles Heel Hypothesis: Pitfalls for AI Systems via Decision Theoretic Adversaries

The Achilles Heel hypothesis is presented, which states that highly effective goal-oriented systems, even potentially superintelligent ones, may nonetheless have stable decision-theoretic delusions that cause them to make obviously irrational decisions in adversarial settings.

Conjecturing-Based Computational Discovery of Patterns in Data

This work proposes the use of a conjecturing machine that generates feature relationships in the form of bounds for numerical features and Boolean expressions for nominal features, relationships that machine learning algorithms typically ignore.
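A minimal sketch of the numerical-bound part of such a conjecturing machine, assuming the simplest conjecture template `feature_i <= feature_j + c` (the template, function name, and toy data are illustrative, not from the paper): for each ordered pair of features it proposes the tightest constant that holds on every observed row.

```python
import numpy as np

def conjecture_bounds(X, names):
    """Sketch of a conjecturing machine for numerical features: for every
    ordered feature pair, propose the tightest affine bound
    names[i] <= names[j] + c that holds on all observed rows."""
    conjectures = []
    n = X.shape[1]
    for i in range(n):
        for j in range(n):
            if i != j:
                c = float(np.max(X[:, i] - X[:, j]))  # tightest constant
                conjectures.append((names[i], names[j], c))
    return conjectures

# Toy data where "area" never exceeds "perimeter" (one row per shape).
X = np.array([[4.0, 8.0], [9.0, 12.0], [1.0, 4.0]])
bounds = conjecture_bounds(X, ["area", "perimeter"])
```

Each returned triple is a data-supported conjecture; in the paper's setting such bounds would then be screened or verified rather than taken as theorems.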

Link Prediction using Graph Neural Networks for Master Data Management

Novel methods for data anonymization, model training, explainability, and verification for Link Prediction in Master Data Management are introduced, and the results are discussed.

An analysis of gamma ray data collected at traffic intersections in Northern Virginia

The analysis approach used here is described, and the results, in terms of radioisotope classes and frequency patterns over day-of-week and time-of-day spans, are discussed.

Mutual information-based group explainers with coalition structure for machine learning model explanations

A feature grouping technique that employs an information-theoretic measure of dependence to design appropriate group explainers is proposed, unifying the two points of view under predictor dependencies and reducing the complexity of group explanations.
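The grouping step can be sketched as follows. This is a simplified stand-in, not the paper's method: it merges features whose pairwise absolute correlation exceeds a threshold (the paper uses an information-theoretic dependence measure instead), using union-find so that dependent features end up in one explanation group.

```python
import numpy as np

def group_features(X, threshold=0.8):
    """Sketch of dependence-based grouping: features whose pairwise absolute
    correlation exceeds `threshold` are merged into one group. Correlation is
    a stand-in for the paper's information-theoretic dependence measure."""
    n = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    parent = list(range(n))
    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] > threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.01 * rng.normal(size=500)   # nearly a copy of feature a
c = rng.normal(size=500)              # independent feature
groups = group_features(np.column_stack([a, b, c]))
```

A group explainer would then attribute importance to each group as a unit, which is where the reduction in explanation complexity comes from.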

XAI & I: Self-explanatory AI facilitating mutual understanding between AI and human experts

Co-evolutionary hybrid intelligence is a key concept for the world intellectualization

The inconsistency of developing artificial intelligence as an independent tool is shown. The work describes the logic and concept of intelligence development regardless of its substrate, whether human or machine, and argues that co-evolutionary hybridization of machine and human intelligence will make it possible to solve problems so far inaccessible to humanity.

Providing Error Detection for Deep Learning Image Classifiers Using Self-Explainability

A self-explainable Deep Learning (SE-DL) system for image classification is presented that performs self-error detection, together with a concept selection methodology for scoring all concepts and selecting a subset of them based on their contribution to the error detection performance of the proposed SE-DL system.

Interpretable Explanations of Black Boxes by Meaningful Perturbation

A general framework for learning different kinds of explanations for any black box algorithm is proposed, and the framework is specialised to find the part of an image most responsible for a classifier decision.
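The core idea, finding a minimal deletion mask that destroys the evidence for a decision, can be sketched in closed form for a linear score. The weight vector, signal, and hyperparameters below are hypothetical, and the real method optimises a smoothed mask over image pixels against a deep network.

```python
import numpy as np

# Hypothetical linear "classifier" score on a 1-D signal: only indices 3..6
# actually drive the score.
w = np.zeros(10)
w[3:7] = 1.0
x = np.ones(10)

def meaningful_perturbation(x, w, lam=0.5, lr=0.1, steps=200):
    """Sketch of mask optimisation for a linear score w.x: find m in [0,1]^n
    minimising score(x * (1 - m)) + lam * sum(m), i.e. the smallest deletion
    mask that removes the evidence for the decision."""
    m = np.full_like(x, 0.5)
    for _ in range(steps):
        grad = -w * x + lam          # d/dm of the objective for a linear score
        m = np.clip(m - lr * grad, 0.0, 1.0)
    return m

mask = meaningful_perturbation(x, w)
```

The area penalty `lam` controls the trade-off: only features whose contribution to the score exceeds `lam` are worth masking, so the mask converges onto exactly the evidence the classifier relies on.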

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
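A minimal sketch of the LIME recipe on tabular inputs (the black-box function, kernel width, and sampling scale are illustrative assumptions, not the paper's defaults): sample perturbations around the instance, weight them by proximity, and fit a weighted linear surrogate whose coefficients are the explanation.

```python
import numpy as np

# Hypothetical black-box model: prediction depends nonlinearly on two features.
def black_box(X):
    return np.tanh(3.0 * X[:, 0]) + 0.1 * X[:, 1] ** 2

def lime_explain(f, x0, n_samples=500, kernel_width=0.5, seed=0):
    """LIME-style sketch: sample perturbations around x0, weight them by an
    exponential proximity kernel, and fit a weighted linear surrogate."""
    rng = np.random.default_rng(seed)
    Z = x0 + rng.normal(scale=0.3, size=(n_samples, x0.size))
    y = f(Z)
    d = np.linalg.norm(Z - x0, axis=1)
    wts = np.exp(-(d ** 2) / kernel_width ** 2)   # closer samples count more
    # Weighted least squares via the sqrt-weight trick on [features, intercept].
    A = np.hstack([Z - x0, np.ones((n_samples, 1))])
    sw = np.sqrt(wts)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y[:, None] * sw, rcond=None)
    return coef[:-1, 0]                            # drop the intercept

weights = lime_explain(black_box, np.array([0.1, 0.5]))
```

Near x0 = (0.1, 0.5) the tanh term is steep while the quadratic term is nearly flat, so the surrogate attributes most of the local behaviour to feature 0, which is exactly the kind of faithful local story LIME is after.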

Encoding Visual Attributes in Capsules for Explainable Medical Diagnoses

This is the first study to investigate capsule networks for making predictions based on radiologist-level interpretable attributes and their application to medical image diagnosis. It demonstrates that a simple 2D capsule network can outperform a state-of-the-art deep dense dual-path 3D CNN at capturing visually interpretable high-level attributes and at malignancy prediction, while providing malignancy prediction scores approaching those of non-explainable 3D CNNs.

Towards Robust Interpretability with Self-Explaining Neural Networks

This work designs self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models, and proposes three desiderata for explanations in general – explicitness, faithfulness, and stability.
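The architectural idea generalising linear classifiers can be shown in a forward pass. The weights and encoder below are hypothetical stand-ins for trained networks: the output is an explicit sum of per-concept contributions, so the explanation is read directly off the model rather than approximated afterwards.

```python
import numpy as np

relu = lambda t: np.maximum(t, 0.0)

# Hypothetical self-explaining model f(x) = sum_k theta_k(x) * h_k(x):
# h(x) are interpretable "concepts", theta(x) their input-dependent relevances.
W_h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])   # concept encoder
W_t = np.array([[0.5, 0.2], [-0.3, 0.4], [0.1, 0.1]])   # relevance network

def senn_forward(x):
    """Forward pass of a minimal self-explaining architecture: the prediction
    decomposes exactly into one additive term per concept."""
    h = relu(W_h @ x)           # concept activations
    theta = W_t @ x             # concept relevances
    contributions = theta * h   # one contribution per concept
    return float(contributions.sum()), contributions

x = np.array([1.0, 2.0])
y, contribs = senn_forward(x)
```

Explicitness holds by construction here; the paper's faithfulness and stability desiderata are enforced during training, e.g. by regularising theta to vary slowly with the input.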

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

This work dramatically improves the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network (DGN), which generates qualitatively state-of-the-art synthetic images that look almost real.
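Stripped of the deep generator network, activation maximization is gradient ascent on the input. The sketch below uses a single linear "unit" and an L2 prior as stand-ins (both are illustrative assumptions; the paper's contribution is replacing the hand-crafted prior with a learned DGN).

```python
import numpy as np

# Hypothetical target "neuron": a dot product with a fixed weight vector,
# standing in for one unit of a trained network.
w = np.array([1.0, -2.0, 0.5, 3.0])
activation = lambda x: float(w @ x)

def activation_maximization(unit_weights, lam=0.5, lr=0.1, steps=300):
    """Plain activation maximisation: gradient ascent on the input to maximise
    a unit's activation, with an L2 prior standing in for the learned
    deep-generator-network prior."""
    x = np.zeros_like(unit_weights)
    for _ in range(steps):
        # d/dx [ w.x - lam*||x||^2 ] = w - 2*lam*x
        x += lr * (unit_weights - 2.0 * lam * x)
    return x

x_star = activation_maximization(w)
```

For this linear unit the optimum is w / (2 * lam), i.e. the preferred input mirrors the unit's weights; a DGN prior instead constrains x to the manifold of realistic images, which is why its syntheses look almost real.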

Distilling a Neural Network Into a Soft Decision Tree

A way of using a trained neural net to create a type of soft decision tree that generalizes better than one learned directly from the training data is described.
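The distillation idea can be sketched at depth one. Everything below is a toy stand-in: the "teacher" is a smooth probability function rather than a trained net, and the "tree" is a single sigmoid gate routing between two learned leaf probabilities, trained by gradient descent to mimic the teacher's soft outputs.

```python
import numpy as np

sig = lambda t: 1.0 / (1.0 + np.exp(-t))

# Hypothetical "teacher": a black-box probability the soft tree must mimic.
teacher = lambda x: sig(4.0 * x)

def distill_soft_stump(xs, targets, lr=0.5, steps=2000):
    """Distil teacher probabilities into a depth-1 soft decision tree:
    a sigmoid gate p(x) routes between two learned leaf probabilities."""
    w, b, q0, q1 = 1.0, 0.0, -1.0, 1.0     # gate weights and leaf logits
    for _ in range(steps):
        p = sig(w * xs + b)                # routing probability (right leaf)
        a0, a1 = sig(q0), sig(q1)          # leaf probabilities
        yhat = p * a1 + (1 - p) * a0
        g = 2.0 * (yhat - targets) / len(xs)           # d(MSE)/d(yhat)
        w -= lr * np.sum(g * (a1 - a0) * p * (1 - p) * xs)
        b -= lr * np.sum(g * (a1 - a0) * p * (1 - p))
        q1 -= lr * np.sum(g * p) * a1 * (1 - a1)
        q0 -= lr * np.sum(g * (1 - p)) * a0 * (1 - a0)
    return w, b, q0, q1

xs = np.linspace(-2, 2, 200)
params = distill_soft_stump(xs, teacher(xs))
```

The distilled stump is directly readable: one threshold-like gate and two leaf probabilities, whereas a tree fit to the raw hard labels would miss the dark knowledge in the teacher's soft outputs.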

Definitions, methods, and applications in interpretable machine learning

This work defines interpretability in the context of machine learning, introduces the predictive, descriptive, relevant (PDR) framework for discussing interpretations, and proposes three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy.

A Unified Approach to Interpreting Model Predictions

A unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which unifies six existing methods and presents new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
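The Shapley values that SHAP approximates can be computed exactly for small models by enumerating coalitions. The model, instance, and baseline below are hypothetical, and replacing absent features with baseline values is one common choice of value function; real SHAP implementations use far more efficient estimators.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by coalition enumeration. Features outside the
    coalition S are replaced by baseline values to define v(S)."""
    n = len(x)
    phi = np.zeros(n)
    def v(S):
        z = baseline.copy()
        z[list(S)] = x[list(S)]
        return f(z)
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for k in range(len(rest) + 1):
            for S in combinations(rest, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(S + (i,)) - v(S))
    return phi

f = lambda z: z[0] * z[1] + 2.0 * z[2]          # hypothetical model
x, base = np.array([1.0, 2.0, 3.0]), np.zeros(3)
phi = shapley_values(f, x, base)
```

The efficiency axiom guarantees the attributions sum to f(x) minus f(baseline), and symmetry splits the interaction term z0*z1 equally between the two features, which is the "consistency with human intuition" the SHAP paper builds on.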

Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks

It is contended that over-parameterized blind fitting presents a radical challenge to many of the underlying assumptions and practices in computational neuroscience and cognitive psychology; this perspective informs longstanding debates and establishes unexpected links with evolution, ecological psychology, and artificial life.