# Supersparse linear integer models for optimized medical scoring systems

@article{Ustun2015SupersparseLI, title={Supersparse linear integer models for optimized medical scoring systems}, author={Berk Ustun and Cynthia Rudin}, journal={Machine Learning}, year={2015}, volume={102}, pages={349-391} }

Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new method for creating data-driven scoring systems called a Supersparse Linear Integer Model (SLIM). SLIM…
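To make the idea of a scoring system concrete, the following is a minimal sketch of how such a model produces a prediction. The feature names, point values, and threshold are hypothetical and chosen purely for illustration; they are not taken from the paper, and the code does not implement SLIM's training procedure. It only shows the arithmetic a user would perform, plus a check of the coprimality property mentioned above.

```python
from functools import reduce
from math import gcd

# Hypothetical integer points per binary feature (illustrative only,
# not coefficients from the paper).
POINTS = {"age_ge_60": 4, "hypertension": 2, "bmi_ge_30": 2, "female": -3}
THRESHOLD = 1  # hypothetical cutoff: predict +1 when the total exceeds it


def score(patient):
    """Add up the points of every feature present (value 1) in the patient."""
    return sum(POINTS[name] * patient.get(name, 0) for name in POINTS)


def predict(patient):
    """Return +1 if the total score exceeds the threshold, otherwise -1."""
    return 1 if score(patient) > THRESHOLD else -1


def coefficients_are_coprime(points):
    """True when the nonzero coefficients share no common factor above 1."""
    nonzero = [abs(v) for v in points.values() if v != 0]
    return reduce(gcd, nonzero, 0) == 1
```

For example, `predict({"age_ge_60": 1, "hypertension": 1})` adds 4 and 2 and compares the total against the threshold. Coprime coefficients matter because a common factor could be divided out to give smaller numbers without changing any prediction.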

## 238 Citations

### Interactivity and Transparency in Medical Risk Assessment with Supersparse Linear Integer Models

- Computer Science, ArXiv
- 2019

A technical architecture is described that allows a medical professional who is not specialised in developing and applying machine learning algorithms to interactively create competitive, transparent supersparse linear integer models as scoring systems.

### Interval Coded Scoring: a toolbox for interpretable scoring systems

- Computer Science, PeerJ Comput. Sci.
- 2018

The presented toolbox interface makes Interval Coded Scoring theory easily applicable to both small and large datasets, and allows end users to make a trade-off between complexity and performance based on cross-validation results and expert knowledge.

### Learning Optimized Risk Scores

- Computer Science, J. Mach. Learn. Res.
- 2019

A new machine learning approach is presented that fits risk scores in a way that scales linearly in the number of samples, provides a certificate of optimality, and obeys real-world constraints without parameter tuning or post-processing.

### Optimized Risk Scores

- Computer Science, KDD
- 2017

This paper formulates a principled approach to learning risk scores that are fully optimized for feature selection, integer coefficients, and operational constraints, and presents a new cutting plane algorithm to efficiently recover the optimal solution.

### A Provable Algorithm for Learning Interpretable Scoring Systems

- Computer Science, AISTATS
- 2018

This work introduces an original methodology to simultaneously learn interpretable binning mapped to a class variable, and the weights associated with these bins contributing to the score, and develops and shows the theoretical guarantees for this method.

### The fused lasso penalty for learning interpretable medical scoring systems

- Computer Science, 2017 International Joint Conference on Neural Networks (IJCNN)
- 2017

An original methodology to simultaneously learn interpretable binning mapped to a class variable, and the weights associated with these bins contributing to the score is introduced.

### Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data

- Computer Science, ArXiv
- 2022

An instance-weighted tree-sum method that effectively pools data across diverse groups to output a concise, rule-based model that achieves state-of-the-art prediction performance on important clinical datasets.

### A Bayesian Approach to Learning Scoring Systems

- Computer Science, Big Data
- 2015

A Bayesian method is presented for building scoring systems, which are linear models with coefficients that have very few significant digits; it achieves a high degree of interpretability while maintaining competitive generalization performance.

### Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives

- Computer Science, J. Mach. Learn. Res.
- 2021

A discrete optimization based approach for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features, which leads to models with considerably improved statistical performance when compared to competing toolkits.

### Learning customized and optimized lists of rules with mathematical programming

- Computer Science, Math. Program. Comput.
- 2018

A mathematical programming approach to building rule lists, a type of interpretable, nonlinear, logical machine learning classifier built from IF-THEN rules; the approach is useful for producing non-black-box predictive models and offers a clear user-defined tradeoff between training accuracy and sparsity.

## 97 References

### Classification and Disease Prediction Via Mathematical Programming

- Computer Science
- 2007

This chapter presents novel optimization-based classification models that are general purpose and suitable for developing predictive rules for large heterogeneous biological and medical data sets.

### Mathematical programming formulations for two-group classification with binary variables

- Computer Science, Ann. Oper. Res.
- 1997

The proposed nonparametric mixed integer programming (MIP) formulation for the binary variable classification problem not only has a geometric interpretation but is also Bayes inspired, and therefore possesses a strong probabilistic foundation.

### Sparse weighted voting classifier selection and its linear programming relaxations

- Computer Science, Inf. Process. Lett.
- 2012

### Risk group detection and survival function estimation for interval coded survival methods

- Computer Science, Neurocomputing
- 2013

### A Lasso for Hierarchical Interactions

- Computer Science, Annals of Statistics
- 2013

A precise characterization of the effect of this hierarchy constraint is given, it is proved that hierarchy holds with probability one, and a bound on the degrees-of-freedom estimate reveals the amount of fitting "saved" by the hierarchy constraint.

### On Model Selection Consistency of Lasso

- Computer Science, J. Mach. Learn. Res.
- 2006

It is proved that a single condition, which is called the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model both in the classical fixed p setting and in the large p setting as the sample size n gets large.

### Least angle regression

- Computer Science
- 2004

A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.

### Derivation of a simple clinical model to categorize patients probability of pulmonary embolism: increasing the models utility with the SimpliRED D-dimer.

- Medicine, Thrombosis and Haemostasis
- 2000

The combination of a score ≤ 4.0 by the authors' simple clinical prediction rule and a negative SimpliRED D-dimer result may safely exclude PE in a large proportion of patients with suspected PE.

### Boosting Classifiers with Tightened L0-Relaxation Penalties

- Computer Science, ICML
- 2010

A new boosting algorithm based on linear programming with dynamic generation of variables and constraints is proposed which improves on current algorithms for weighted voting classification by striking a better balance between classification accuracy and the sparsity of the weight vector.

### Oscillation Heuristics for the Two-group Classification Problem

- Mathematics, J. Classif.
- 2004

A new nonparametric family of oscillation heuristics for improving linear classifiers in the two-group discriminant problem is proposed, motivated by the intuition that the classification accuracy of a separating hyperplane can be improved through small perturbations to its slope and position.