Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.
@article{Harrell1996MultivariablePM,
title={Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors.},
author={Frank E. Harrell and K L Lee and Daniel B. Mark},
journal={Statistics in medicine},
year={1996},
volume={15},
number={4},
pages={361--387}
}
Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly…
5,422 Citations
Quantifying the impact of different approaches for handling continuous predictors on the performance of a prognostic model
- Environmental Science
- Statistics in medicine
- 2016
Three broad approaches for handling continuous predictors are examined, including various methods of categorising predictors, modelling a linear relationship between the predictor and outcome and modelling a nonlinear relationship using fractional polynomials or restricted cubic splines.
Validation measures for prognostic models for independent and correlated binary and survival outcomes
- Psychology
- 2012
Existing validation measures for independent data, such as the C-index, D statistic, calibration slope, Brier score, and the K statistic, are extended for use with random effects/frailty models.
An evaluation of penalised survival methods for developing prognostic models with rare events
- Environmental Science
- Statistics in medicine
- 2012
Three existing penalised methods that have been proposed to improve predictive accuracy (ridge, lasso and the garotte) are evaluated using simulated data derived from two clinical datasets; the results suggest that significant improvements are possible by taking a penalised modelling approach.
Prognostic Modeling with Logistic Regression Analysis
- Biology
- Medical decision making: an international journal of the Society for Medical Decision Making
- 2001
A sensible strategy in small data sets is to apply shrinkage methods in full models that include well-coded predictors that are selected based on external information, such as full models including all available covariables.
Measures of discrimination and predictive accuracy for interval censored survival data
- Psychology
- 2015
Medical researchers frequently make statements that one model predicts survival better than another, and are frequently challenged to provide rigorous statistical justification for these statements.…
Internal validation of predictive models: efficiency of some procedures for logistic regression analysis.
- Psychology
- Journal of clinical epidemiology
- 2001
Several methods to assess improvement in risk prediction models: Extension to survival analysis
- Environmental Science
- Statistics in medicine
- 2011
The primary parameters considered are net reclassification improvement (NRI), integrated discrimination improvement (IDI), and a primary measure of concordance, the area under the ROC curve (AUC), also called the c-statistic.
Risk assessment with newer statistical metrics
- Medicine
- Statistics in medicine
- 2017
The first Pencina article in this issue assesses the impact of calibration on the newer metrics of model performance, including area under the curve (AUC), discrimination slope, R-model, and R-residuals.
Assessing calibration of prognostic risk scores
- Computer Science
- Statistical methods in medical research
- 2016
A model-based framework for assessing calibration in the binary setting is described that extends naturally to survival data, and Poisson regression models are shown to provide an easy way to assess calibration in prognostic models.
Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets.
- Mathematics
- Statistics in medicine
- 2000
Stepwise selection with a low alpha led to relatively poor model performance when evaluated on independent data; shrinkage methods in full models including prespecified predictors, together with incorporation of external information, are recommended when prognostic models are constructed in small data sets.
References
Regression modelling strategies for improved prognostic prediction.
- Psychology
- Statistics in medicine
- 1984
A general index of predictive discrimination is used to measure the ability of a model developed on training samples of varying sizes to predict survival in an independent test sample of patients suspected of having coronary artery disease.
A bootstrap resampling procedure for model building: application to the Cox regression model.
- Mathematics
- Statistics in medicine
- 1992
A bootstrap-model selection procedure is developed, combining the bootstrap method with existing selection techniques such as stepwise methods, for the selection of variables in the framework of a regression model which might influence the outcome variable.
Applied Logistic Regression
- Psychology
- 1989
Applied Logistic Regression, Third Edition provides an easily accessible introduction to the logistic regression model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables.
Measures of explained variation for survival data.
- Business
- Statistics in medicine
- 1990
The importance of quantifying the predictive power of a prognostic model is discussed, and measures of explained variation as a possible quantification are suggested.
Assessment of predictive models for binary outcomes: an empirical approach using operative death from cardiac surgery.
- Medicine
- Statistics in medicine
- 1994
Eight methodological strategies for creating predictive models are compared in a large, complex data base consisting of preoperative risk and operative outcome data on 12,712 patients undergoing coronary artery bypass grafting and entered into the Department of Veterans Affairs Cardiac Surgery Risk Assessment Program between April 1987 and March 1990.
Flexible Methods for Analyzing Survival Data Using Splines, with Applications to Breast Cancer Prognosis
- Mathematics
- 1992
In an analysis of a large data set taken from clinical trials conducted by the Eastern Cooperative Oncology Group, these methods are seen to give useful insight into how prognosis varies as a function of continuous covariates, and also into how covariate effects change with follow-up time.
Prediction error and its estimation for subset-selected models
- Psychology
- 1991
Strategies are compared for development of a linear regression model and the subsequent assessment of its predictive ability. Simulations were performed as a designed experiment over a range of data…
Proportional hazards tests and diagnostics based on weighted residuals
- Mathematics
- 1994
Nonproportional hazards can often be expressed by extending the Cox model to include time varying coefficients; e.g., for a single covariate, the hazard function for subject i is modelled as…
Regression Splines in the Cox Model with Application to Covariate Effects in Liver Disease
- Mathematics
- 1990
The Cox proportional hazards model restricts the log hazard ratio to be linear in the covariates. A smooth nonlinear covariate effect may go undetected in this model but can be well…
Predictive value of statistical models.
- Computer Science
- Statistics in medicine
- 1990
A review is given of different ways of estimating the error rate of a prediction rule based on a statistical model and how cross-validation can be used to obtain an adjusted predictor with smaller error rate.