imodels: a Python Package for Fitting Interpretable Models

@article{Singh2021ImodelsAP,
  title={imodels: a Python Package for Fitting Interpretable Models},
  author={Chandan Singh and Keyan Nasseri and Yan Shuo Tan and Tiffany M. Tang and Bin Yu},
  journal={J. Open Source Softw.},
  year={2021},
  volume={6},
  pages={3192}
}
imodels is a Python package for concise, transparent, and accurate predictive modeling. It provides users a simple interface for fitting and using state-of-the-art interpretable models, all compatible with scikit-learn (Pedregosa et al., 2011). These models can often replace black-box models while improving interpretability and computational efficiency, all without sacrificing predictive accuracy. In addition, the package provides a framework for developing custom tools and rule-based models… 
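The package's estimators follow the scikit-learn fit/predict convention, so they can be dropped into existing pipelines. Below is a minimal usage sketch, assuming the scikit-learn-compatible estimator API the abstract describes (RuleFitClassifier is one of the estimators listed in the package's documentation; the dataset is chosen only for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from imodels import RuleFitClassifier  # other imodels estimators follow the same interface

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RuleFitClassifier()      # used exactly like a scikit-learn estimator
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
print(model)                     # the fitted model itself is the interpretable artifact
```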

Interpreting and improving deep-learning models with reality checks

TLDR
This chapter covers recent work aiming to interpret models by attributing importance to features and feature groups for a single prediction, and shows how these attributions can be used to directly improve the generalization of a neural network or to distill it into a simple model.

Fast Interpretable Greedy-Tree Sums (FIGS)

TLDR
FIGS generalizes the CART algorithm to simultaneously grow a flexible number of trees in a summation; it avoids repeated splits and often yields more concise decision rules than fitted decision trees, without sacrificing predictive performance.
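A hedged usage sketch of FIGS via imodels, assuming the FIGSClassifier estimator and its max_rules budget parameter as listed in the package documentation:

```python
from sklearn.datasets import load_breast_cancer
from imodels import FIGSClassifier

X, y = load_breast_cancer(return_X_y=True)

# FIGS fits a *sum* of shallow trees; max_rules caps the total number of splits
# across all trees in the summation, keeping the overall model concise.
figs = FIGSClassifier(max_rules=8)
figs.fit(X, y)
print(figs)  # displays the fitted tree-sum as readable rules
```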

VeridicalFlow: a Python package for building trustworthy data science pipelines with PCS

TLDR
VeridicalFlow is a Python package that simplifies building reproducible and trustworthy data science pipelines using the PCS framework by screening models for predictive performance, helping automate computation, and facilitating stability analysis.

Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff

TLDR
This work demonstrates various visualization techniques for fully interpretable methods as well as model-agnostic post hoc attributions, and provides a generalized method for evaluating the quality of explanations using infidelity and local Lipschitz across model types from logistic regression to BERT variants.

TE2Rules: Extracting Rule Lists from Tree Ensembles

TLDR
A novel approach that converts a tree ensemble (TE) trained for a binary classification task into a rule list that is globally equivalent to the TE, is comprehensible to a human, and serves as a fast alternative to state-of-the-art rule-based instance-level outcome explanation techniques.

Adaptive wavelet distillation from neural networks through interpretations

TLDR
Adaptive wavelet distillation (AWD) is proposed: a method that distills information from a trained neural network into a wavelet transform, yielding a scientifically interpretable and concise model with predictive performance better than state-of-the-art neural networks.

Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods

TLDR
Hierarchical Shrinkage (HS) is introduced: a post-hoc algorithm that does not modify the tree structure, but instead regularizes the tree by shrinking the prediction at each node towards the sample means of its ancestors.
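A minimal self-contained sketch of this shrinkage rule applied to a fitted scikit-learn tree (an illustration of the idea, not the package's HSTreeClassifier implementation; it assumes the telescoping form f(node) = f(parent) + (mean(node) - mean(parent)) / (1 + lambda / N(parent)) described in the paper):

```python
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor

def hierarchical_shrinkage(fitted_tree, reg_param=10.0):
    """Rewrite leaf predictions by shrinking each node's increment over its parent."""
    t = fitted_tree.tree_

    def recurse(node, shrunk, parent_mean, parent_n):
        node_mean = t.value[node][0][0]          # sample mean of y at this node
        if parent_n is None:                     # root keeps its own sample mean
            shrunk = node_mean
        else:                                    # shrink the increment over the parent
            shrunk = shrunk + (node_mean - parent_mean) / (1.0 + reg_param / parent_n)
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:                           # leaf: overwrite the stored prediction
            t.value[node][0][0] = shrunk
        else:
            n = t.n_node_samples[node]
            recurse(left, shrunk, node_mean, n)
            recurse(right, shrunk, node_mean, n)

    recurse(0, 0.0, 0.0, None)
    return fitted_tree

X, y = load_diabetes(return_X_y=True)
tree = DecisionTreeRegressor(max_leaf_nodes=20, random_state=0).fit(X, y)
hierarchical_shrinkage(tree, reg_param=10.0)     # splits are unchanged, only values
```

Note that the tree's structure and splits are untouched; only the stored predictions change, which is what makes the method post hoc.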

Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data

TLDR
An instance-weighted tree-sum method that effectively pools data across diverse groups to output a concise, rule-based model that achieves state-of-the-art prediction performance on important clinical datasets.

POTATO: exPlainable infOrmation exTrAcTion framewOrk

TLDR
POTATO is a task- and language-independent framework for human-in-the-loop (HITL) learning of rule-based text classifiers using graph-based features; it has been applied in projects across domains and languages, including classification tasks on German legal text and English social media data.

Seven Principles for Rapid-Response Data Science: Lessons Learned from Covid-19 Forecasting

TLDR
Seven principles are described in the context of working with Response4Life, a then-new nonprofit organization, to illustrate their necessity in dealing with problems that require rapid response, often resembling agile software development.

References

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language.

Definitions, methods, and applications in interpretable machine learning

TLDR
This work defines interpretability in the context of machine learning, introduces the predictive, descriptive, relevant (PDR) framework for discussing interpretations, and proposes three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy.

PREDICTIVE LEARNING VIA RULE ENSEMBLES

General regression and classification models are constructed as linear combinations of simple rules derived from the data. Each rule consists of a conjunction of a small number of simple statements concerning the values of individual input variables.
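To make the form concrete, here is a toy sketch of the prediction function such a rule ensemble produces (the rules and weights below are invented purely for illustration):

```python
import numpy as np

# Each rule is a conjunction of simple threshold statements on individual inputs.
rules = [
    lambda x: (x[0] > 2.0) and (x[3] <= 0.5),
    lambda x: x[1] <= 1.0,
]
coefs = np.array([0.7, -0.3])  # learned rule weights a_k (illustrative values)
intercept = 0.1                # a_0

def rule_ensemble_predict(x):
    """Linear combination of binary rule indicators: F(x) = a_0 + sum_k a_k * r_k(x)."""
    r = np.array([float(rule(x)) for rule in rules])
    return intercept + coefs @ r

print(rule_ensemble_predict(np.array([3.0, 0.2, 5.0, 0.4])))  # 0.1 + 0.7 - 0.3 = 0.5
```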

Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model

TLDR
A generative model called Bayesian Rule Lists is introduced that yields a posterior distribution over possible decision lists; it employs a novel prior structure to encourage sparsity and has predictive accuracy on par with the current top algorithms for prediction in machine learning.

Supersparse linear integer models for optimized medical scoring systems

TLDR
This paper provides bounds on the testing and training accuracy of SLIM scoring systems, and presents a new data reduction technique that can improve scalability by eliminating a portion of the training data beforehand.
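For intuition, a toy sketch of the kind of scoring system SLIM optimizes (the features, point values, and threshold here are invented for illustration): small integer points per binary feature, summed and compared to a threshold.

```python
# Hypothetical point system; SLIM searches directly for integer weights like these.
points = {"age_ge_60": 2, "hypertension": 1, "smoker": 1}
threshold = 2

def predict(patient):
    """Sum the points for the features a patient has, then threshold the score."""
    score = sum(pts for feat, pts in points.items() if patient.get(feat, False))
    return int(score >= threshold)  # 1 = predicted high risk

print(predict({"age_ge_60": True, "smoker": False}))  # score 2 -> prints 1
```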

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

  • C. Rudin
  • Computer Science
    Nat. Mach. Intell.
  • 2019
TLDR
This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision.

Classification and regression trees

  • W. Loh
  • Computer Science
    WIREs Data Mining Knowl. Discov.
  • 2011
TLDR
This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.

Very Simple Classification Rules Perform Well on Most Commonly Used Datasets

  • R. Holte
  • Computer Science
    Machine Learning
  • 2004
TLDR
On most datasets studied, the best of very simple rules that classify examples on the basis of a single attribute is as accurate as the rules induced by the majority of machine learning systems.

Interpretable machine learning: A guide for making black box models explainable

  • Lulu.com. https://christophm.github.io/interpretable-ml-book/
  • 2020

Skope-rules

  • GitHub repository. https://github.com/scikit-learn-contrib/skope-rules
  • 2021