Decision trees are widely used in machine learning. When constructing a tree, appropriate attributes have to be selected as its nodes according to some criterion, and there are several approaches to attribute selection. In this paper, we present a new approach to selecting attributes for decision tree construction based on rough set...
Evaluating classifier performance is a crucial problem in pattern recognition and machine learning. In this paper, we propose a new measure, i.e. confusion entropy, for evaluating classifiers. For each class cl_i of an (N + 1)-class problem, the misclassification information involves both the information of how the samples with true class label cl_i have...
Gene selection is a key problem in gene expression based cancer recognition and related tasks. In this work, a measure called neighborhood mutual information (NMI) is introduced to evaluate the relevance between genes and the related decision. The measure is then combined with the search strategy of minimal redundancy and maximal relevancy (mRMR) for...
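The mRMR search strategy mentioned above can be sketched as a greedy loop: at each step, pick the feature whose relevance to the labels, minus its average redundancy with the features already chosen, is largest. The sketch below is only an illustration under stated assumptions: it uses ordinary Shannon mutual information on discrete values rather than the neighborhood mutual information of the abstract, and the names `mutual_information`, `mrmr_select` and the toy features `g1`, `g2`, `g3` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Shannon mutual information (in bits) of two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy mRMR sketch. `features` maps a name to a list of discrete values."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):

        def score(name):
            if not selected:
                return relevance[name]
            # Average mutual information with the features chosen so far.
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        remaining = [name for name in features if name not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Toy data (hypothetical): a four-class label encoded by two binary "genes".
labels = [0, 1, 2, 3, 0, 1, 2, 3]
features = {
    "g1": [0, 0, 1, 1, 0, 0, 1, 1],  # high bit of the label
    "g2": [0, 0, 1, 1, 0, 0, 1, 1],  # exact duplicate of g1 (fully redundant)
    "g3": [0, 1, 0, 1, 0, 1, 0, 1],  # low bit of the label
}
chosen = mrmr_select(features, labels, 2)
```

In this toy case all three features are equally relevant on their own, but once `g1` is chosen the duplicate `g2` is penalised for redundancy, so the second pick is `g3`.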
Pruning decision trees is deemed an effective way of solving over-fitting in practice. Pruned decision trees usually have simpler structure and are expected to have higher generalization ability at the expense of classification accuracy. Nowadays, various pruning methods are available. However, the problem of how to make a trade-off between structural...
Cancer classification is the critical basis for patient-tailored therapy. Conventional histological analysis tends to be unreliable because different tumors may have similar appearance. The advances in microarray technology make individualized therapy possible. Various machine learning methods can be employed to classify cancer tissue samples based on...
To evaluate the classification model of an information system, a proper measure is usually needed to determine whether the model is appropriate for the specific domain task. Though many performance measures have been proposed, few were specially defined for multi-class problems, which tend to be more complicated than two-class problems,...
Feature selection is very important to classification. In this paper, we propose to select features based on Bayes minimum error probability (SFBMEP), and we apply the proposed method to sift possible functional genes for classifying cancers. The method dynamically evaluates all available genes and sifts only one gene at a time with the computation of...
Confusion entropy (CEN) is a new measure for evaluating the performance of classifiers. For each class in a classification problem, the CEN metric considers not only the misclassification information about how the true samples in this class have been misclassified to the other classes, but also the misclassification information about how the other samples have been...
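The two kinds of misclassification information described above can be read directly off a confusion matrix: for a given class, the off-diagonal entries of its row count samples of that class sent to other classes, and the off-diagonal entries of its column count samples of other classes sent to it. The sketch below only extracts these two counts; it does not compute CEN itself, whose formula is not given in the abstract. The function name `misclassification_counts` and the example matrix are hypothetical.

```python
def misclassification_counts(confusion):
    """Return (outgoing, incoming) misclassification counts per class.

    Assumes confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    counts = []
    for i in range(n):
        # Samples of class i predicted as some other class (row, off-diagonal).
        outgoing = sum(confusion[i][j] for j in range(n) if j != i)
        # Samples of other classes predicted as class i (column, off-diagonal).
        incoming = sum(confusion[j][i] for j in range(n) if j != i)
        counts.append((outgoing, incoming))
    return counts


# Hypothetical 3-class confusion matrix for illustration.
matrix = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]
counts = misclassification_counts(matrix)
# counts == [(1, 2), (3, 1), (0, 1)]
```

A measure based only on overall accuracy would see just the diagonal of this matrix; keeping both counts per class preserves where the errors come from and where they go.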
With the emergence of high-dimensional data, feature selection techniques are becoming increasingly important to learning algorithms. Many metrics have been introduced for feature selection. Among them, mutual information is a prominent one and has been developed over the past years. In this paper, a novel feature selection method based on the...