Leveraging user expertise in collaborative systems for annotating energy datasets
We present a new weighted voting classification ensemble method, called WAVE, that uses two weight vectors: a weight vector of classifiers and a weight vector of instances. The instance weight vector assigns higher weights to observations that are hard to classify. The classifier weight vector puts larger weights on classifiers that perform better on hard-to-classify instances. Each weight vector is calculated in conjunction with the other through an iterative procedure: instances with higher weights play a more important role in determining the weights of classifiers, and vice versa. We proved that the iterated weight vectors converge to the optimal weights, which can be calculated directly from the performance matrix of the classifiers in an ensemble. The final prediction of the ensemble is obtained by voting with the optimal weight vector of classifiers. To compare simple majority voting with the proposed weighted voting, we applied both voting methods to bootstrap aggregation and investigated their performance on 28 data sets. The results show that the proposed weighted voting performs significantly better than simple majority voting in general.

∗Corresponding author, tel: +82 2 2123 2545, fax: +82-2-2123-8638. Email address: firstname.lastname@example.org (Hyunjoong Kim)
Preprint submitted to Journal of the Korean Statistical Society, March 2, 2011
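The mutual dependence between the two weight vectors can be illustrated with a small sketch. The abstract does not state the update formulas, so the rule used here is an assumption chosen only to match the description: an instance's weight is proportional to the total weight of the classifiers that misclassify it, and a classifier's weight is proportional to the total weight of the instances it classifies correctly. The function names `wave_like_weights` and `weighted_vote`, the 0/1 performance matrix `X`, and the toy data are all hypothetical and are not taken from the paper.

```python
def wave_like_weights(X, tol=1e-12, max_iter=1000):
    """Iterate classifier weights p and instance weights q to a fixed point.

    X[i][j] is 1 if classifier j classifies instance i correctly, else 0.
    ASSUMPTION: this update rule is a simplified stand-in, not the
    published WAVE formula.
    """
    n, k = len(X), len(X[0])
    p = [1.0 / k] * k  # start from equal classifier weights
    q = [1.0 / n] * n  # start from equal instance weights
    for _ in range(max_iter):
        # An instance is "hard" when heavily weighted classifiers get it wrong.
        q_raw = [sum((1 - X[i][j]) * p[j] for j in range(k)) for i in range(n)]
        q_sum = sum(q_raw)
        if q_sum == 0:  # no classifier makes any error: keep current weights
            break
        q = [v / q_sum for v in q_raw]
        # A classifier is rewarded for being right on heavily weighted instances.
        p_raw = [sum(X[i][j] * q[i] for i in range(n)) for j in range(k)]
        p_sum = sum(p_raw)
        if p_sum == 0:  # no classifier is right on any hard instance
            break
        p_new = [v / p_sum for v in p_raw]
        delta = max(abs(a - b) for a, b in zip(p_new, p))
        p = p_new
        if delta < tol:  # weights have stopped changing
            break
    return p, q


def weighted_vote(predictions, p):
    """Return the label with the largest total classifier weight."""
    totals = {}
    for label, weight in zip(predictions, p):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Toy performance matrix: 6 instances (rows), 3 classifiers (columns).
X = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
]
p, q = wave_like_weights(X)
print("classifier weights:", p)
print("instance weights:", q)
print("weighted vote:", weighted_vote(["a", "b", "c"], p))
```

Under this assumed rule, one full pass amounts to multiplying the classifier weights by a fixed matrix built from `X` and renormalising, which is a power iteration. That is one way to see why a limit computable directly from the performance matrix is plausible, although the paper's own proof and formula may differ.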