# Fitting Prediction Rule Ensembles with R Package pre

@article{Fokkema2020FittingPR, title={Fitting Prediction Rule Ensembles with R Package pre}, author={Marjolein Fokkema}, journal={Journal of Statistical Software}, year={2020} }

Prediction rule ensembles (PREs) are sparse collections of rules, offering highly interpretable regression and classification models. [...] Results indicate that pre derives ensembles with predictive accuracy comparable to that of random forests, while using a smaller number of variables for prediction.
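As a minimal sketch of the workflow the paper describes, a PRE can be fitted with the package's formula interface; the airquality example below follows the package's own documentation and assumes pre is installed (`install.packages("pre")`).

```r
library(pre)

# Use complete cases only, as in the package documentation example
airq <- airquality[complete.cases(airquality), ]

set.seed(42)  # rule generation involves randomness, so set a seed
airq.ens <- pre(Ozone ~ ., data = airq)

print(airq.ens)       # selected rules and their coefficients
importance(airq.ens)  # variable importances
predict(airq.ens, newdata = airq[1:4, ])
```

The printed ensemble lists each selected rule alongside its coefficient, which is what makes the fitted model directly interpretable.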


## 19 Citations

Improved prediction rule ensembling through model-based data generation

- Computer Science, arXiv
- 2021

The use of surrogacy models can substantially improve the sparsity of PREs, while retaining predictive accuracy, especially through the use of a nested surrogacy approach.

Fitting prediction rule ensembles to psychological research data: An introduction and tutorial.

- Psychology, Computer Science, Psychological Methods
- 2020

The methodology is introduced, and several real-data examples from psychological research show how PREs can be fitted using the R package pre, illustrating a number of features of the package that may be particularly useful for applications in psychology.

Linear Aggregation in Tree-based Estimators

- Computer Science
- 2019

A new algorithm is introduced that finds the best axis-aligned split for fitting optimal linear aggregation functions on the corresponding nodes; the method is implemented in a provably fast way, enabling more interpretable trees and better predictive performance on a wide range of data sets.

SIRUS: making random forests interpretable

- Computer Science, arXiv
- 2019

SIRUS (Stable and Interpretable RUle Set), a new classification algorithm based on random forests, which takes the form of a short list of rules, achieves a remarkable stability improvement over cutting-edge methods.

GPSRL: Learning Semi-Parametric Bayesian Survival Rule Lists from Heterogeneous Patient Data

- Computer Science, 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021

This paper proposes a new semi-parametric Bayesian Survival Rule List model that derives a rule-based decision-making approach; within the regime defined by each rule, survival risk is modelled via a Gaussian process latent variable model.

Differentiating mania/hypomania from happiness using a machine learning analytic approach.

- Psychology, Journal of Affective Disorders
- 2020

Learning Interpretable Rules Contributing to Maximal Fuel Rate Flow Consumption in an Aircraft using Rule Based Algorithms

- Computer Science, 2020 IEEE International Conference for Innovation in Technology (INOCON)
- 2020

The main aim of this paper was to extract interpretable and visually justifiable rules for the fuel intake in each phase of flight, using a new rule-based algorithm called the Generalized Linear Rules Model, which is still under research in the machine learning space.

Understanding the complexity of sepsis mortality prediction via rule discovery and analysis: a pilot study

- Medicine, BMC Medical Informatics and Decision Making
- 2021

Glasgow Coma Scale, serum potassium, and serum bilirubin are found to be the most important risk factors for predicting death in sepsis patients.

Melancholia defined with the precision of a machine.

- Psychology, Medicine, Journal of Affective Disorders
- 2020

Differentiation of bipolar disorder versus borderline personality disorder: A machine learning approach.

- Psychology, Journal of Affective Disorders
- 2021

## References

Showing 1-10 of 39 references

Predictive learning via rule ensembles

- Computer Science
- 2008

General regression and classification models are constructed as linear combinations of simple rules derived from the data. Each rule consists of a conjunction of a small number of simple statements…

Modified rule ensemble method for binary data and its applications

- Computer Science
- 2014

This study solved the excess pruning problem by constructing RuleFit within a logistic regression framework, weighting the base learners by the elastic net, and demonstrated higher predictive performance than the original RuleFit model.

Solving Regression by Learning an Ensemble of Decision Rules

- Computer Science, ICAISC
- 2008

A novel decision rule induction algorithm for solving the regression problem is proposed; the results of the experiments presented in the paper show that the prediction model, in the form of an ensemble of decision rules, is powerful.

Node harvest

- Computer Science
- 2009

When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy…

An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests.

- Computer Science, Psychological Methods
- 2009

The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low- and high-dimensional data exploration, and also to point out limitations of the methods and potential pitfalls in their practical application.

ENDER: a statistical framework for boosting decision rules

- Computer Science, Data Mining and Knowledge Discovery
- 2010

A learning algorithm, called ENDER, constructs an ensemble of decision rules; it is tailored for regression and binary classification problems and uses a boosting approach for learning, which can be treated as a generalization of sequential covering.

Generating Rule Sets from Model Trees

- Computer Science, Australian Joint Conference on Artificial Intelligence
- 1999

This paper presents an algorithm for inducing simple, accurate decision lists from model trees and shows that this method produces comparably accurate and smaller rule sets than the commercial state-of-the-art rule learning system Cubist.

Greedy function approximation: A gradient boosting machine.

- Computer Science
- 2001

A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.

Benchmarking Open-Source Tree Learners in R/RWeka

- Computer Science, GfKl
- 2007

Both classification tree algorithms are found to be competitive in terms of misclassification error, with the performance difference clearly varying across data sets; however, C4.5 tends to grow larger and thus more complex trees.

Classification and Regression by randomForest

- Computer Science
- 2007

Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.