An experimental evaluation of boosting methods for classification.

@article{Stollhoff2010AnEE,
  title={An experimental evaluation of boosting methods for classification},
  author={Rainer Stollhoff and Willi Sauerbrei and Martin Schumacher},
  journal={Methods of Information in Medicine},
  year={2010},
  volume={49},
  number={3},
  pages={219--229}
}

OBJECTIVES
In clinical medicine, the accuracy achieved by classification rules is often not sufficient to justify their use in daily practice. In order to improve classifiers it has become popular to combine single classification rules into a classification ensemble. Two popular boosting methods will be compared with classical statistical approaches.
METHODS
Using data from a clinical study on the diagnosis of breast tumors, and by simulation, we will compare AdaBoost with gradient boosting…
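The comparison described in the abstract can be sketched with off-the-shelf implementations. This is a minimal illustration on synthetic data, not the study's actual breast-tumor data or settings; the data set, classifier parameters, and cross-validation scheme are placeholders:

```python
# Illustrative comparison of AdaBoost and gradient boosting on synthetic data.
# Data set and parameters are placeholders, not the study's setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

for name, clf in [
    ("AdaBoost", AdaBoostClassifier(n_estimators=100, random_state=0)),
    ("GradientBoosting", GradientBoostingClassifier(n_estimators=100, random_state=0)),
]:
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold CV accuracy per fold
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

Both methods fit an additive ensemble of weak learners; they differ in how each new base classifier is weighted and trained on the residual errors of the current ensemble.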
14 Citations
Boosting for high-dimensional two-class prediction
- Computer Science · BMC Bioinformatics
- 2015
Simulation studies and real gene-expression data sets show that boosting can substantially improve the performance of its base classifier even when data are high-dimensional; however, not all boosting algorithms perform equally well.
Multi-class HingeBoost. Method and application to the classification of cancer types using gene expression data.
- Computer Science · Methods of Information in Medicine
- 2012
A new functional gradient descent boosting algorithm that directly extends the HingeBoost algorithm from the binary case to the multi-class case without reducing the original problem to multiple binary problems.
Improvement of adequate use of warfarin for the elderly using decision tree-based approaches.
- Medicine · Methods of Information in Medicine
- 2014
Whether the effectiveness of using warfarin among elderly inpatients can be improved when machine learning techniques and data from the laboratory information system are incorporated is examined.
On novel approaches for classification. A proposal for an interdisciplinary debate.
- Computer Science · Methods of Information in Medicine
- 2010
Standard statistics can be used to judge whether a novel classification scheme performs significantly better than the standard classifier, and if two different classification schemes are applied to the same data set, each subject can be judged to be correctly classified by each of the two classifiers.
The importance of knowing when to stop. A sequential stopping rule for component-wise gradient boosting.
- Computer Science · Methods of Information in Medicine
- 2012
The newly developed sequential stopping rule improved purely AIC-based methods when used for the microarray-based prediction of the recurrence of metastases for stage II colon cancer patients and outperformed earlier approaches if applied to both simulated and real data.
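The stopping-rule idea summarized above can be illustrated with a simple validation-based early stop for gradient boosting. This is a stand-in for the AIC-based and sequential rules the paper discusses, not their actual algorithm; data and parameters are placeholders, and a recent scikit-learn is assumed:

```python
# Illustrative early stopping for gradient boosting: pick the number of
# iterations that minimizes validation error (a stand-in for AIC-based
# and sequential stopping rules; not the paper's actual method).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(n_estimators=300, learning_rate=0.1, random_state=0)
gb.fit(X_tr, y_tr)

# Validation error after each boosting iteration; stop where it is minimal.
val_err = [np.mean(pred != y_val) for pred in gb.staged_predict(X_val)]
best_m = int(np.argmin(val_err)) + 1
print(f"stop after {best_m} of {gb.n_estimators} iterations")
```

Running too many boosting iterations overfits; the point of any stopping rule is to halt the ensemble while validation (or information-criterion) performance is still improving.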
Comparison of Machine Learning Classifiers to Predict Patient Survival and Genetics of GBM: Towards a Standardized Model for Clinical Implementation
- Medicine · ArXiv
- 2021
Nine machine learning classifiers, with different optimization parameters, are compared to predict overall survival (OS), isocitrate dehydrogenase mutation, O-6-methylguanine-DNA-methyltransferase promoter methylation, epidermal growth factor receptor variant III (EGFRvIII) amplification, and Ki-67 expression in GBM patients, based on radiomic features from conventional and advanced MR.
Regularization for generalized additive mixed models by likelihood-based boosting.
- Computer Science · Methods of Information in Medicine
- 2012
The concept of boosting is extended to generalized additive mixed models and an appropriate algorithm is presented that uses two different approaches for the fitting procedure of the variance components of the random effects.
AI and High-Grade Glioma for Diagnosis and Outcome Prediction: Do All Machine Learning Models Perform Equally Well?
- Medicine · Frontiers in Oncology
- 2021
This work aimed to compare ML classifiers to predict clinically relevant tasks for HGG: overall survival, isocitrate dehydrogenase (IDH) mutation, O-6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation, epidermal growth factor receptor variant III (EGFRvIII) amplification, and Ki-67 expression, based on radiomic features from conventional and advanced magnetic resonance imaging (MRI).
Supporting regenerative medicine by integrative dimensionality reduction.
- Biology · Methods of Information in Medicine
- 2012
The experimental results highlighted the main potentials of proposed approaches, including the ability to predict the true staging by combining multiple training data sets when this could not be inferred from a single data source, and to focus on a reduced list of genes of similar predictive performance.
Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups
- Medicine · BMC Medical Informatics and Decision Making
- 2012
The results show that a little more than half of the fallers can be identified correctly by the model, but the positive predictive value is too low to be applicable in practice; high-risk subgroups are identified automatically from existing geriatric assessment data.
References
SHOWING 1-10 OF 71 REFERENCES
An empirical comparison of ensemble methods based on classification trees
- Computer Science
- 2003
An empirical comparison of the classification error of several ensemble methods based on classification trees is performed by using 14 data sets that are publicly available and that were used by Lim, Loh and Shih in 2000.
Special Invited Paper-Additive logistic regression: A statistical view of boosting
- Computer Science
- 2000
This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
BagBoosting for tumor classification with gene expression data
- Computer Science · Bioinformatics
- 2004
When bagging is used as a module in boosting, the resulting classifier consistently improves the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data.
Experiments with a New Boosting Algorithm
- Computer Science · ICML
- 1996
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Using T3, an improved decision tree classifier, for mining stroke-related medical data.
- Computer Science · Methods of Information in Medicine
- 2007
T3, a classification algorithm that builds decision trees of depth at most three, is presented; it achieves high accuracy whilst keeping the tree size reasonably small, and demonstrates strong descriptive and predictive power without compromising simplicity and clarity.
Two models for outcome prediction - a comparison of logistic regression and neural networks.
- Computer Science · Methods of Information in Medicine
- 2006
The conscientiously applied LR remains the gold standard for prognostic modelling; however, ANN can be an alternative automated "quick and easy" multivariate analysis.
Comparison of misclassification rates of search partition analysis and other classification methods.
- Computer Science · Statistics in Medicine
- 2006
The performance of SPAN is compared against 33 other classification methods, including tree, neural network and regression methods, from the trials reported by Lim et al., on 16 data sets, most of which were health related.
A Comparison of Nonparametric Error Rate Estimation Methods in Classification Problems
- Mathematics
- 2004
Assessment of the misclassification error rate is of high practical relevance in many biomedical applications. As it is a complex problem, theoretical results on estimator performance are few. The…
On the Bayes-risk consistency of regularized boosting methods
- Computer Science
- 2003
The main result of the paper is that certain regularized boosting algorithms provide Bayes-risk consistent classifiers under the sole assumption that the Bayes classifier may be approximated by a convex combination of the base classifiers.
Prediction error estimation: a comparison of resampling methods
- Computer Science · Bioinformatics
- 2005
This work compares several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection, and finds that LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis and the .632+ bootstrap has the lowest mean square error.
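The resampling estimators compared in that reference are straightforward to reproduce in outline. Below is a minimal sketch of LOOCV and 10-fold CV error estimates for linear discriminant analysis on synthetic data; the data set and sizes are placeholders, and the .632+ bootstrap is omitted for brevity:

```python
# Illustrative resampling-based error estimation for LDA:
# LOOCV vs. 10-fold CV (sketch on synthetic data, not the paper's study).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=100, n_features=10, random_state=0)
clf = LinearDiscriminantAnalysis()

err_loo = 1 - cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
err_10cv = 1 - cross_val_score(clf, X, y, cv=10).mean()
print(f"LOOCV error: {err_loo:.3f}, 10-fold CV error: {err_10cv:.3f}")
```

Note the paper's setting includes feature selection: to avoid optimistic bias, any selection step must be performed inside each resampling fold, not once on the full data.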