Analysis of feature selection stability on high-dimensional, small-sample data
Classification problems involving high-dimensional data with a small number of observations are becoming increasingly common, particularly in microarray data. Over the last two decades, many efficient classification models and feature selection (FS) algorithms have been proposed to achieve high prediction accuracy. Linear Programming Boosting (LPBoost) is a supervised classifier from the boosting family of classifiers. Prediction performance suffers when the FS algorithm applied is not effective for the given data set. LPBoost maximizes the margin between training samples of different classes and therefore belongs to the class of margin-maximizing supervised classification algorithms. Booster can thus also be used as a criterion to evaluate the performance of an FS algorithm or to estimate the difficulty of a data set for classification. LPBoost iteratively optimizes the dual misclassification costs and dynamically generates weak hypotheses to build new LP columns.
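The column-generation loop described above (optimize dual misclassification costs, then generate a weak hypothesis that becomes a new LP column) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the decision-stump weak learner, the function names, and the use of SciPy's HiGHS LP solver for the LPBoost dual are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog

def best_stump(X, y, u):
    """Weak learner (assumed here): decision stump maximizing the edge sum_i u_i y_i h(x_i)."""
    best = (-np.inf, 0, 0.0, 1)          # (edge, feature, threshold, polarity)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            h = np.where(X[:, f] > t, 1.0, -1.0)
            edge = np.dot(u * y, h)
            for pol in (1, -1):          # flipped stump has edge of opposite sign
                if pol * edge > best[0]:
                    best = (pol * edge, f, t, pol)
    return best

def stump_predict(X, f, t, pol):
    return pol * np.where(X[:, f] > t, 1.0, -1.0)

def lpboost(X, y, D=0.2, eps=1e-6, max_iter=50):
    """Soft-margin LPBoost via column generation; D must satisfy D >= 1/n for feasibility."""
    n = len(y)
    u = np.full(n, 1.0 / n)              # dual misclassification costs, start uniform
    beta = 0.0
    H, stumps = [], []
    for _ in range(max_iter):
        edge, f, t, pol = best_stump(X, y, u)
        if H and edge <= beta + eps:     # no remaining column improves the dual LP
            break
        stumps.append((f, t, pol))
        H.append(stump_predict(X, f, t, pol))
        # Dual LP: min beta  s.t.  sum_i u_i y_i h_j(x_i) <= beta for all j,
        #          sum_i u_i = 1,  0 <= u_i <= D.
        m = len(H)
        c = np.zeros(n + 1); c[-1] = 1.0
        A_ub = np.hstack([np.array(H) * y, -np.ones((m, 1))])
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m),
                      A_eq=np.append(np.ones(n), 0.0).reshape(1, -1), b_eq=[1.0],
                      bounds=[(0, D)] * n + [(None, None)], method="highs")
        u, beta = res.x[:n], res.x[-1]
    # Hypothesis weights are the duals of the inequality constraints.
    a = np.maximum(-res.ineqlin.marginals, 0.0)
    a /= a.sum()
    def predict(Xq):
        scores = sum(w * stump_predict(Xq, f, t, pol)
                     for w, (f, t, pol) in zip(a, stumps))
        return np.sign(scores)
    return predict
```

Each round, the LP is re-solved over the hypotheses found so far, and the updated costs `u` concentrate on hard-to-classify samples; the loop stops once no weak hypothesis has an edge exceeding the current dual objective `beta`.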