• Corpus ID: 219179781

Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model

Alejandro Schuler, Aashish Bhardwaj, Vincent X. Liu
Clinical researchers often select among and evaluate risk prediction models using standard machine learning metrics based on confusion matrices. However, if these models are used to allocate interventions to patients, standard metrics calculated from retrospective data are only related to model utility (in terms of reductions in outcomes) under certain assumptions. When predictions are delivered repeatedly throughout time (e.g. in a patient encounter), the relationship between standard metrics… 

Decision Curve Analysis: A Novel Method for Evaluating Prediction Models
  • A. Vickers, E. Elkin
  • Psychology
    Medical decision making : an international journal of the Society for Medical Decision Making
  • 2006
Decision curve analysis is a suitable method for evaluating alternative diagnostic and prognostic strategies that has advantages over other commonly used measures and techniques.
Using relative utility curves to evaluate risk prediction
The relative utility curve is introduced, which gauges the potential for better prediction in terms of utilities without requiring a reference level for one utility, while providing a sensitivity analysis for misspecification of utilities.
Decision Analysis for the Evaluation of Diagnostic Tests, Prediction Models, and Molecular Markers
Decision analytic methods can provide insight into the consequences of using a test, model, or marker in clinical practice; additional data from the literature, or subjective assessments from individual patients or clinicians, are needed to assign weights appropriately.
Optimal intensive care outcome prediction over time using machine learning
The assessment of prognosis over more than one day may be a valuable strategy as new information on the second day helps to differentiate outcomes, and new ML models based on trend data beyond the first day could greatly improve upon current risk stratification tools.
Using Electronic Health Record Data to Develop and Validate a Prediction Model for Adverse Outcomes in the Wards
A prediction tool for ward patients that simultaneously predicts the risk of cardiac arrest and ICU transfer was more accurate than the VitalPAC Early Warning Score and could be implemented in the electronic health record to alert caregivers with real-time information regarding patient deterioration.
Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests
Net benefit is useful for determining whether basing clinical decisions on a model, marker, or test would do more good than harm, in contrast to traditional measures such as sensitivity, specificity, or area under the curve, which are statistical abstractions not directly informative about clinical value.
Prediction of Sepsis in the Intensive Care Unit With Minimal Electronic Health Record Data: A Machine Learning Approach
InSight, a machine learning classification system that uses multivariable combinations of easily obtained patient data, is an effective tool for predicting sepsis onset and performs well even with randomly missing data.
Development of a Multicenter Ward-Based AKI Prediction Model.
Readily available electronic health record data can be used to improve AKI risk stratification with good to excellent accuracy. Real-time use of the Electronic Signal to Prevent AKI would allow early interventions before changes in serum creatinine and may improve costs and outcomes.
DeepSOFA: A Continuous Acuity Score for Critically Ill Patients using Clinically Interpretable Deep Learning
A novel acuity score framework (DeepSOFA) is proposed that leverages temporal measurements and interpretable deep learning models to assess illness severity at any point during an ICU stay and yields significantly more accurate predictions of in-hospital mortality.
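
Several of the entries above evaluate models by net benefit rather than by confusion-matrix metrics alone. As a rough illustration, the standard net benefit at a threshold probability p_t is TP/n − (FP/n) · p_t / (1 − p_t). The sketch below computes it in plain Python; the function names and the toy data are hypothetical and are not taken from any of the listed papers.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    Illustrative helper (hypothetical name), using the standard formula
    TP/n - (FP/n) * threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    tp = sum(1 for y, _ in flagged if y == 1)  # flagged patients with the outcome
    fp = sum(1 for y, _ in flagged if y == 0)  # flagged patients without the outcome
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that intervenes on everyone."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, invented purely for illustration.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4, 0.3, 0.05]

model_nb = net_benefit(y_true, y_prob, threshold=0.5)
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
print(model_nb, treat_all_nb)
```

A model is usually judged worthwhile at a given threshold only if its net benefit exceeds both the treat-all value and zero (the treat-none strategy). Note that this calculation still rests on retrospective labels, so it inherits the assumptions about intervention effects that the abstract above calls out.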