Do we need hundreds of classifiers to solve real world classification problems?

@article{FernndezDelgado2014DoWN,
  title={Do we need hundreds of classifiers to solve real world classification problems?},
  author={Manuel Fern{\'a}ndez-Delgado and Eva Cernadas and Sen{\'e}n Barro and Dinani Gomes Amorim},
  journal={Journal of Machine Learning Research},
  year={2014},
  volume={15},
  pages={3133--3181}
}
We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and… 
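The evaluation protocol is easy to reproduce in miniature. A minimal sketch, assuming scikit-learn as a stand-in for the paper's Weka/R implementations and iris as a placeholder for one of the UCI datasets:

```python
# Illustrative sketch (not the paper's Weka/R setup): cross-validated
# comparison of a few classifier families on one dataset.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # placeholder for one UCI dataset
families = {
    "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "SVM (RBF)": SVC(gamma="scale"),
    "k-NN": KNeighborsClassifier(),
    "naive Bayes": GaussianNB(),
    "logistic regression": LogisticRegression(max_iter=1000),
}
for name, clf in families.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold CV accuracy
    print(f"{name:20s} {scores.mean():.3f} +/- {scores.std():.3f}")
```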
Comparison of 14 different families of classification algorithms on 115 binary datasets
TLDR
This work tested 14 very different classification algorithms on 115 real-life binary datasets, found that the three best classifiers are not significantly different from each other, and used a Bayesian ANOVA analysis to conclude that, with high probability, the differences between these three classifiers are not of practical consequence.
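The paper's Bayesian ANOVA is not sketched here; as a simpler frequentist stand-in, a Wilcoxon signed-rank test over paired per-dataset accuracies asks the same practical question. The accuracies below are synthetic, for illustration only:

```python
# Hedged stand-in for the paper's analysis: test whether two classifiers
# differ consistently across many datasets, using paired accuracies.
import numpy as np
from scipy.stats import wilcoxon

# acc_a[i], acc_b[i]: accuracy of classifiers A and B on dataset i
# (synthetic numbers, for illustration only)
rng = np.random.default_rng(0)
acc_a = rng.uniform(0.7, 0.95, size=115)
acc_b = acc_a + rng.normal(0.0, 0.01, size=115)  # B ~ A plus small noise

stat, p = wilcoxon(acc_a, acc_b)
print(f"Wilcoxon p-value: {p:.3f}")  # large p => no consistent difference
```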
Is rotation forest the best classifier for problems with continuous features?
TLDR
It is demonstrated that on large problems rotation forest can be made an order of magnitude faster without significant loss of accuracy, and it is maintained that without any domain knowledge to indicate an algorithm preference, rotation forest should be the default algorithm of choice for problems with continuous attributes.
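A heavily simplified sketch of the rotation forest idea (each tree trained on a PCA-rotated view of the data); the real algorithm of Rodriguez et al. rotates random feature subsets fitted on class subsamples, which this toy omits:

```python
# Toy rotation-forest-like ensemble: every tree sees the data under a PCA
# rotation fitted on a bootstrap sample; prediction is a majority vote.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

class ToyRotationForest:
    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(seed)
        self.members = []

    def fit(self, X, y):
        n = len(X)
        for _ in range(self.n_trees):
            boot = self.rng.integers(0, n, n)      # bootstrap for the rotation
            rot = PCA().fit(X[boot])               # full-rank PCA rotation
            tree = DecisionTreeClassifier(random_state=0)
            tree.fit(rot.transform(X), y)          # tree grown on rotated data
            self.members.append((rot, tree))
        return self

    def predict(self, X):
        votes = np.stack([t.predict(r.transform(X)) for r, t in self.members])
        # majority vote over trees (assumes integer class labels)
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```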
Large dimensional analysis of general margin based classification methods
  • Hanwen Huang
  • Computer Science
    Journal of Statistical Mechanics: Theory and Experiment
  • 2021
TLDR
Analysis of the asymptotic performance of a family of large-margin classifiers under two-component mixture models, in the regime where the data dimension p and the sample size n are both large, sheds new light on how to select the best classifier among various classification methods and how to choose the optimal tuning parameters for a given method.
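The practical side of the tuning question, sketched with scikit-learn: selecting the regularization parameter C (and kernel width) of a margin classifier by cross-validation. The dataset and grid values are placeholders:

```python
# Choosing tuning parameters of a margin-based classifier by grid search
# with cross-validation (the practical counterpart to the asymptotic theory).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
search = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```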
Generalized Linear Splitting Rules in Decision Forests
TLDR
This work establishes a generalized decision forest framework called Randomer Forests (RerFs), which encompasses RFs and many previously proposed decision forest algorithms as particular instantiations, proposes a default instantiation, and provides theoretical and experimental evidence motivating its use.
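A hedged approximation of the RerF idea, not the authors' implementation: candidate splits on sparse random linear combinations of features rather than on single features, here via a hand-rolled ensemble of trees over sparse random projections:

```python
# Oblique-split approximation: grow each tree on a different sparse random
# projection of the features, then combine trees by majority vote.
import numpy as np
from sklearn.random_projection import SparseRandomProjection
from sklearn.tree import DecisionTreeClassifier

def fit_rerf_like(X, y, n_trees=25, seed=0):
    members = []
    for i in range(n_trees):
        proj = SparseRandomProjection(n_components=X.shape[1],
                                      dense_output=True, random_state=seed + i)
        Xp = proj.fit_transform(X)            # sparse linear feature combos
        members.append((proj, DecisionTreeClassifier(random_state=0).fit(Xp, y)))
    return members

def predict_rerf_like(members, X):
    votes = np.stack([t.predict(p.transform(X)) for p, t in members])
    # majority vote over trees (assumes integer class labels)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```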
Trees, forests, and impurity-based variable importance in regression
TLDR
This paper analyzes MDI and proves that, if input variables are independent and interactions are absent, MDI provides a variance decomposition of the output in which the contribution of each variable is clearly identified.
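For context, the MDI (mean decrease in impurity) importance analyzed here is what scikit-learn's tree ensembles expose as feature_importances_; a minimal example on synthetic regression data:

```python
# feature_importances_ of scikit-learn forests is the MDI importance:
# impurity decrease at each split, averaged over the trees.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=8, n_informative=3,
                       random_state=0)
forest = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
for j, imp in enumerate(forest.feature_importances_):
    print(f"x{j}: MDI importance {imp:.3f}")
```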
Enhancing Multi-Class Classification of Random Forest using Random Vector Functional Neural Network and Oblique Decision Surfaces
TLDR
The proposed method provides rich insight into the data by grouping the confusing or hard-to-classify samples for each class and thus provides an opportunity to apply fine-grained classification rules over the data.
The Heterogeneous Ensembles of Standard Classification Algorithms (HESCA): the Whole is Greater than the Sum of its Parts
TLDR
HESCA is a fast approach that is, on average, as good as state-of-the-art classifiers, whereas HESCA+, which contains a deep neural network, a support vector machine, and two decision tree forests, is significantly better than its components, than picking the best component, and than HESCA.
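A sketch of the HESCA recipe using scikit-learn building blocks (not the authors' code): heterogeneous components combined by voting, with each component weighted by an estimate of its cross-validated accuracy:

```python
# Heterogeneous weighted ensemble: weight each component by its CV accuracy,
# then combine with soft (probability-weighted) voting.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
components = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True))),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
]
weights = [cross_val_score(m, X, y, cv=5).mean() for _, m in components]
ensemble = VotingClassifier(components, voting="soft", weights=weights)
ensemble.fit(X, y)
print(round(ensemble.score(X, y), 3))
```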
Weighted Scoring in Geometric Space for Decision Tree Ensemble
TLDR
The proposed algorithm horizontally partitions the learning set and uses decision trees as base models to obtain decision regions; it is proved that, for an infinitely dense division of the space, the proposed algorithm is equivalent to majority voting.
...

References

Showing 1-10 of 116 references
Logistic Model Trees
TLDR
This paper uses a stagewise fitting process to construct logistic regression models that can select relevant attributes in the data in a natural way, and shows how this approach can be used to build the logistic regression models at the leaves by incrementally refining those constructed at higher levels in the tree.
The Power of Decision Tables
TLDR
Experimental results show that on artificial and real-world domains containing only discrete features, IDTM, an algorithm inducing decision tables, can sometimes outperform state-of-the-art algorithms such as C4.5.
Extreme Learning Machine for Regression and Multiclass Classification
TLDR
ELM provides a unified learning platform with a wide range of feature mappings and can be applied directly to regression and multiclass classification; in theory, ELM can approximate any continuous target function and classify any disjoint regions.
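The ELM recipe is short enough to state directly: a random, untrained hidden layer followed by a least-squares fit of the output weights. A minimal NumPy sketch, assuming integer class labels 0..K-1:

```python
# Minimal Extreme Learning Machine: hidden weights are random and fixed;
# only the output weights are fitted, by least squares via a pseudoinverse.
import numpy as np

def elm_fit(X, y, n_hidden=200, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    T = np.eye(y.max() + 1)[y]                    # one-hot targets (labels 0..K-1)
    beta = np.linalg.pinv(H) @ T                  # least-squares output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```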
Special Invited Paper-Additive logistic regression: A statistical view of boosting
TLDR
This work shows that the seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood; it develops more direct approximations and shows that they give results nearly identical to boosting.
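The identity at the heart of that statistical view, stated for reference (notation follows the paper, with y in {-1, +1}): boosting's stagewise additive fit F(x) = sum_m c_m f_m(x) minimizes an exponential criterion whose population minimizer is half the log-odds:

```latex
J(F) = \mathbb{E}\left[ e^{-y F(x)} \right]
\quad \text{is minimized by} \quad
F^{*}(x) = \frac{1}{2} \log \frac{P(y = 1 \mid x)}{P(y = -1 \mid x)}.
```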
MetaCost: a general method for making classifiers cost-sensitive
TLDR
A principled method for making an arbitrary classifier cost-sensitive by wrapping a cost-minimizing procedure around it is proposed, called MetaCost, which treats the underlying classifier as a black box, requiring no knowledge of its functioning or change to it.
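The core relabeling step is compact; the sketch below omits MetaCost's bagged probability estimation and simply reuses the wrapped classifier's predict_proba, so it is an approximation of the method, not Domingos' full procedure:

```python
# MetaCost-style relabeling: give each training example the class label of
# minimum expected cost under the classifier's probability estimates, then
# retrain the (black-box) classifier on the new labels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def metacost_relabel(clf, X, y, cost):
    """cost[i, j] = cost of predicting class i when the truth is class j."""
    proba = clf.fit(X, y).predict_proba(X)  # P(j | x) estimates
    expected = proba @ cost.T               # expected cost of predicting each i
    return expected.argmin(axis=1)          # minimum-expected-cost labels

# Example cost matrix: missing class 1 costs 5x a false alarm.
cost = np.array([[0.0, 5.0],
                 [1.0, 0.0]])
# y_new = metacost_relabel(RandomForestClassifier(random_state=0), X, y, cost)
# final = RandomForestClassifier(random_state=0).fit(X, y_new)
```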
Penalized classification using Fisher's linear discriminant
  • D. Witten, R. Tibshirani
  • Computer Science
    Journal of the Royal Statistical Society. Series B, Statistical methodology
  • 2011
TLDR
This work proposes penalized LDA, a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability, and uses a minorization-maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors.
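scikit-learn does not implement Witten and Tibshirani's penalized LDA; the closest built-in relative is LDA with shrinkage of the within-class covariance, which regularizes the estimate but does not sparsify the discriminant vectors:

```python
# Related regularization (not the paper's method): Ledoit-Wolf shrinkage of
# the within-class covariance in LDA, useful when p is large relative to n.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
# lda.fit(X_train, y_train); lda.score(X_test, y_test)
```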
Experiments with a New Boosting Algorithm
TLDR
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
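The same comparison is easy to rerun with scikit-learn's implementations (by default, AdaBoost boosts decision stumps while bagging bags full trees); the dataset is a placeholder:

```python
# Boosting vs. bagging on one dataset, scored by 5-fold cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for name, ens in [("AdaBoost", AdaBoostClassifier(n_estimators=100)),
                  ("bagging", BaggingClassifier(n_estimators=100))]:
    print(name, cross_val_score(ens, X, y, cv=5).mean().round(3))
```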
Random Forests
TLDR
Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest, and they are also applicable to regression.
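The "internal estimates" are the out-of-bag (OOB) error, which scikit-learn exposes directly, so no held-out validation set is needed to monitor the forest:

```python
# Out-of-bag error: each tree is scored on the bootstrap samples it never saw,
# giving an internal generalization estimate for free.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=500, oob_score=True,
                                random_state=0).fit(X, y)
print(f"OOB accuracy: {forest.oob_score_:.3f}")
```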
The Alternating Decision Tree Learning Algorithm
TLDR
A new type of classification rule, the alternating decision tree, is introduced; it generalizes decision trees, voted decision trees, and voted decision stumps, and it generates rules that are usually smaller in size and thus easier to interpret.
Classification and regression trees
  • W. Loh
  • Computer Science
    WIREs Data Mining Knowl. Discov.
  • 2011
TLDR
This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weakness in two examples.
...