# SIRUS: making random forests interpretable

@article{Bnard2019SIRUSMR, title={SIRUS: making random forests interpretable}, author={Cl{\'e}ment B{\'e}nard and G{\'e}rard Biau and S{\'e}bastien Da Veiga and Erwan Scornet}, journal={ArXiv}, year={2019}, volume={abs/1908.06852} }

State-of-the-art learning algorithms, such as random forests or neural networks, are often qualified as "black-boxes" because of the high number and complexity of operations involved in their prediction mechanism. This lack of interpretability is a strong limitation for applications involving critical decisions, typically the analysis of production processes in the manufacturing industry. In such critical contexts, models have to be interpretable, i.e., simple, stable, and predictive. To…

## 8 Citations

### Interpretable Random Forests via Rule Extraction

- Computer Science, AISTATS
- 2021

This work introduces SIRUS (Stable and Interpretable RUle Set) for regression, a stable rule learning algorithm that takes the form of a short and simple list of rules. It combines a simple structure with remarkably stable behavior when data is perturbed.

### Visualisation and knowledge discovery from interpretable models

- Computer Science, 2020 International Joint Conference on Neural Networks (IJCNN)
- 2020

The newly developed classifiers helped in investigating the complexities of the UCI dataset as a multiclass problem and were comparable to those reported in the literature for this dataset, with the additional value of interpretability when the dataset was treated as a binary class problem.

### Random forests for global sensitivity analysis: A selective review

- Computer Science, Reliab. Eng. Syst. Saf.
- 2021

### A rigorous method to compare interpretability

- Computer Science
- 2020

The aim of this article is to propose a rigorous mathematical definition of interpretability that allows fair comparisons between rule-based algorithms. The definition is built from three notions, each quantitatively measured by a simple formula: predictivity, stability, and simplicity.

### MP-Boost: Minipatch Boosting via Adaptive Feature and Observation Sampling

- Computer Science, 2021 IEEE International Conference on Big Data and Smart Computing (BigComp)
- 2021

Boosting methods are among the best general-purpose and off-the-shelf machine learning approaches, gaining widespread popularity. In this paper, we seek to develop a boosting method that yields…

### Robust and Heterogenous Odds Ratio: Estimating Price Sensitivity for Unbought Items

- Economics, Manufacturing & Service Operations Management
- 2022

Problem definition: Mining for heterogeneous responses to an intervention is a crucial step for data-driven operations, for instance, to personalize treatment or pricing. We investigate how to…

### Predicting Cell-Penetrating Peptides: Building and Interpreting Random Forest based prediction Models

- Biology, bioRxiv
- 2020

This work builds prediction models for CPPs exploring features covering a range of properties based on amino acid sequences, using Random forest classifiers which are often more interpretable than other ensemble machine learning algorithms.

### A framework for the risk prediction of avian influenza occurrence: An Indonesian case study

- Business, PLoS ONE
- 2021

A framework for predicting the occurrence and spread of avian influenza events in a geographical area is proposed. It is suggested that the framework could act as a tool to gain a broad understanding of the drivers of avian influenza epidemics and may facilitate the prediction of future disease events.

## References


### SIRUS: Stable and Interpretable RUle Set for classification

- Computer Science
- 2020

SIRUS (Stable and Interpretable RUle Set), a new classification algorithm based on random forests, which takes the form of a short list of rules, achieves a remarkable stability improvement over cutting-edge methods.

### ENDER: a statistical framework for boosting decision rules

- Computer Science, Data Mining and Knowledge Discovery
- 2010

ENDER is a learning algorithm that constructs an ensemble of decision rules. It is tailored for regression and binary classification problems and learns by boosting, which can be treated as a generalization of sequential covering.

### Interpretable Decision Sets: A Joint Framework for Description and Prediction

- Computer Science, KDD
- 2016

This work proposes interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable, and provides a new approach to interpretable machine learning that balances accuracy, interpretability, and computational efficiency.

### Generating Accurate Rule Sets Without Global Optimization

- Computer Science, ICML
- 1998

This paper presents an algorithm for inferring rules by repeatedly generating partial decision trees, thus combining the two major paradigms for rule generation—creating rules from decision trees and the separate-and-conquer rule-learning technique.

### Node harvest

- Computer Science
- 2009

When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy.…

### Definitions, methods, and applications in interpretable machine learning

- Computer Science, Proceedings of the National Academy of Sciences
- 2019

This work defines interpretability in the context of machine learning and introduces the predictive, descriptive, relevant (PDR) framework for discussing interpretations, with three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy.

### Random Forests

- Computer Science, Machine Learning
- 2001

Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

### Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model

- Computer Science, ArXiv
- 2015

A generative model called Bayesian Rule Lists is introduced. It yields a posterior distribution over possible decision lists, employs a novel prior structure to encourage sparsity, and has predictive accuracy on par with the current top algorithms for prediction in machine learning.

### Interpretable machine learning: definitions, methods, and applications

- Computer Science, ArXiv
- 2019

This paper first defines interpretability in the context of machine learning and places it within a generic data science life cycle, then introduces the Predictive, Descriptive, Relevant (PDR) framework, consisting of three desiderata for evaluating and constructing interpretations.

### Predictive Learning via Rule Ensembles

- Computer Science
- 2008

General regression and classification models are constructed as linear combinations of simple rules derived from the data. Each rule consists of a conjunction of a small number of simple statements…
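The model form described in this summary (a linear combination of rules, each rule a conjunction of simple statements) can be illustrated with a minimal sketch. The feature names, thresholds, weights, and the `Rule` and `predict` helpers below are all hypothetical and chosen for illustration only; the sketch shows how such a model produces a prediction, not how the paper derives or fits its rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# A sample maps feature names to values (feature names are hypothetical).
Sample = Dict[str, float]
# A simple statement is a boolean test on one sample, e.g. "temp > 30".
Condition = Callable[[Sample], bool]


@dataclass
class Rule:
    """A conjunction of simple statements with an associated weight."""
    conditions: Sequence[Condition]
    weight: float

    def fires(self, x: Sample) -> bool:
        # The rule applies only if every statement in the conjunction holds.
        return all(cond(x) for cond in self.conditions)


def predict(rules: Sequence[Rule], intercept: float, x: Sample) -> float:
    """Linear combination: intercept plus the weights of all rules that fire."""
    return intercept + sum(r.weight for r in rules if r.fires(x))


# Illustrative rules with made-up thresholds and weights.
rules = [
    Rule([lambda x: x["temp"] > 30, lambda x: x["pressure"] <= 2.0], weight=1.5),
    Rule([lambda x: x["temp"] <= 10], weight=-0.5),
]

hot_low_pressure = predict(rules, intercept=1.0, x={"temp": 35.0, "pressure": 1.0})
cold = predict(rules, intercept=1.0, x={"temp": 5.0, "pressure": 1.0})
no_rule = predict(rules, intercept=1.0, x={"temp": 20.0, "pressure": 3.0})
```

In this sketch each prediction can be read off directly from the rules that fire, which is the property that makes short rule lists attractive for interpretability.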