Uncertainty in Lung Cancer Stage for Outcome Estimation via Set-Valued Classification
@inproceedings{Bergquist2021UncertaintyIL, title={Uncertainty in Lung Cancer Stage for Outcome Estimation via Set-Valued Classification}, author={Savannah L. Bergquist and Gabriel A. Brooks and Mary Beth Landrum and Nancy L. Keating and Sherri Rose}, year={2021} }
Difficulty in identifying cancer stage in health care claims data has limited oncology quality of care and health outcomes research. We fit prediction algorithms for classifying lung cancer stage into three classes (stages I/II, stage III, and stage IV) using claims data, and then demonstrate a method for incorporating the classification uncertainty in outcomes estimation. Leveraging set-valued classification and split conformal inference, we show how a fixed algorithm developed in one cohort…
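The abstract names set-valued classification and split conformal inference without showing the mechanics. The following is a minimal generic sketch of split conformal prediction sets for a three-class problem, not the authors' implementation. It assumes class probabilities come from some already-fitted classifier, uses the common nonconformity score 1 - p(true class), and the function names `conformal_threshold` and `prediction_sets` are hypothetical.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Threshold on the score 1 - p(true class) from a held-out calibration set.

    cal_probs:  (n, n_classes) predicted class probabilities (assumed given).
    cal_labels: (n,) integer class labels, e.g. 0 = stage I/II, 1 = III, 2 = IV.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: every class is included.
        return 1.0
    return float(np.sort(scores)[k - 1])


def prediction_sets(test_probs, threshold):
    """Boolean matrix; entry (i, c) is True when class c is in the set for row i."""
    test_probs = np.asarray(test_probs, dtype=float)
    return (1.0 - test_probs) <= threshold
```

A row whose set contains more than one class is an observation the classifier cannot confidently assign to a single stage. Under exchangeability of calibration and test data, sets built this way contain the true class with probability at least 1 - alpha on average.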
References
Classifying Stage IV Lung Cancer From Health Care Claims: A Comparison of Multiple Analytic Approaches.
- Medicine; JCO clinical cancer informatics
- 2019
Machine learning algorithms have potential to improve lung cancer stage classification but may be prone to overfitting, and degradation of accuracy between development and validation cohorts suggests the need for caution in implementing machine learning in research or care delivery.
Development of predictive models to identify advanced-stage cancer patients in a US healthcare claims database.
- Medicine; Cancer epidemiology
- 2019
Detecting Lung and Colorectal Cancer Recurrence Using Structured Clinical/Administrative Data to Enable Outcomes Research and Population Health Management
- Medicine; Medical care
- 2017
Algorithms to detect the presence and timing of recurrence after definitive therapy for stages I–III lung and colorectal cancer using 2 data sources that contain a widely available type of structured data linked to gold-standard recurrence status are developed.
Uncertainty estimation for classification and risk prediction on medical tabular data
- Computer Science
- 2020
This work expands and refines the set of heuristics for selecting an uncertainty estimation technique and observes that ensembles and related techniques perform poorly when it comes to detecting out-of-domain examples, a critical task which is carried out more successfully by auto-encoders.
Updated Overview of the SEER-Medicare Data: Enhanced Content and Applications.
- Medicine, Political Science; Journal of the National Cancer Institute. Monographs
- 2020
The large sample size and diverse array of data on cancer patients and noncancer controls in the SEER-Medicare database make it a unique resource for conducting cancer health services research.
Development, Validation, and Dissemination of a Breast Cancer Recurrence Detection and Timing Informatics Algorithm
- Medicine; Journal of the National Cancer Institute
- 2018
Valid and reliable detection of recurrence using data derived from electronic medical records and insurance claims is feasible and will enable extensive, novel research on quality, effectiveness, and outcomes for breast cancer patients and those who develop recurrence.
Survival ensembles.
- Computer Science; Biostatistics
- 2006
A unified and flexible framework for ensemble learning in the presence of censoring for right-censored data is proposed and a random forest algorithm and a generic gradient boosting algorithm are introduced for the construction of prognostic and diagnostic models.
Super Learner for Survival Data Prediction
- Computer Science; The international journal of biostatistics
- 2020
This paper proposes two algorithms for constructing super learners in survival data prediction where the individual algorithms are based on proportional hazards and compares the performance of the proposed super learners with existing models through extensive simulation studies.
Random survival forests
- Computer Science
- 2008
This article introduces random survival forests, a random forests method for the analysis of right-censored survival data, and extends Breiman’s random forests (RF) method, showing it to be highly accurate and comparable to state-of-the-art methods.
Post-prediction Inference
- Computer Science; bioRxiv
- 2020
The postpi approach can correct bias and improve variance estimation (and thus subsequent statistical inference) with predicted outcome data and can improve inference in two totally distinct fields: modeling predicted phenotypes in re-purposed gene expression data and modeling predicted causes of death in verbal autopsy data.