A multiple testing framework for diagnostic accuracy studies with co-primary endpoints.

@article{Westphal2022AMT,
  title={A multiple testing framework for diagnostic accuracy studies with co-primary endpoints.},
  author={Max Westphal and Antonia Zapf and Werner Brannath},
  journal={Statistics in Medicine},
  year={2022}
}
Major advances have been made regarding the utilization of machine learning techniques for disease diagnosis and prognosis based on complex and high-dimensional data. Despite all justified enthusiasm, overoptimistic assessments of predictive performance are still common in this area. However, predictive models and medical devices based on such models should undergo a thorough evaluation before being implemented into clinical practice. In this work, we propose a multiple testing framework for…


Statistical Inference for Diagnostic Test Accuracy Studies with Multiple Comparisons
Diagnostic accuracy studies assess sensitivity and specificity of a new index test in relation to an established comparator or the reference standard. The development and selection of the index test
Improving Model Selection by Employing the Test Data
TLDR
This work investigates the properties of novel evaluation strategies, namely when the final model is selected based on empirical performances on the test data, and improves model selection in terms of the expected final model performance without introducing overoptimism.

References

Evaluation of multiple prediction models: A novel view on model selection and performance assessment
TLDR
It is concluded that evaluating only a single final model is suboptimal and several promising models should be evaluated simultaneously, e.g. all models within one standard error of the best validation model.
Artificial intelligence in healthcare: past, present and future
TLDR
The current status of AI applications in healthcare is surveyed in the three major areas of early detection and diagnosis, treatment, and outcome prediction and prognosis evaluation, and the future of AI in healthcare is discussed.
Deep learning for healthcare: review, opportunities and challenges
TLDR
It is suggested that deep learning approaches could be the vehicle for translating big biomedical data into improved human health, and that holistic and meaningful interpretable architectures should be developed to bridge deep learning models and human interpretability.
Opportunities and obstacles for deep learning in biology and medicine
TLDR
It is found that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
Optimal classifier selection and negative bias in error rate estimation: an empirical study on high-dimensional prediction
TLDR
The strategy to present only the optimal result is not acceptable because it yields a substantial bias in error rate estimation, and alternative approaches for properly reporting classification accuracy are suggested.
Sources of Variation and Bias in Studies of Diagnostic Accuracy
TLDR
A systematic review of all studies in which the main focus was to examine the effects of one or more sources of bias or variation on estimates of test performance, to classify the various sources of variation and bias, describe their effects on test results, and provide a summary of the available evidence that supports each source of bias and variation.
Pivotal Evaluation of the Accuracy of a Biomarker Used for Classification or Prediction: Standards for Study Design
TLDR
A nested case–control study design is proposed that involves prospective collection of specimens before outcome ascertainment from a study cohort that is relevant to the clinical application; common biases that pervade the biomarker research literature would be eliminated if rigorous standards were followed.
Improved Automated Detection of Diabetic Retinopathy on a Publicly Available Dataset Through Integration of Deep Learning.
TLDR
A deep-learning enhanced algorithm for the automated detection of DR achieves significantly better performance than a previously reported, otherwise essentially identical, algorithm that does not employ deep learning.
Artificial intelligence as a medical device in radiology: ethical and regulatory issues in Europe and the United States
TLDR
The legal framework regulating medical devices and data protection in Europe and in the United States is analyzed, assessing developments that are currently taking place, and issues of accountability, both legal and ethical, are considered.