Cluster analysis of Wisconsin Breast Cancer dataset using self-organizing maps.
This paper addresses the breast cancer diagnosis problem as a pattern classification problem. Specifically, the problem is studied using the Wisconsin-Madison breast cancer data set. Fuzzy rules are generated from the input-output relationship so that the diagnosis becomes easier and more transparent for both patients and physicians. For each class, at least one training pattern is chosen as a prototype, provided that (a) the maximum membership of the training pattern is in the given class, and (b) among all the training patterns, the neighborhood of this training pattern has the least fuzzy-rough uncertainty in the given class. Using the fuzzy-rough uncertainty, a cluster is constructed around each prototype. Finally, these clusters are interpreted as fuzzy rules that relate the prognostic factors to the diagnosis results. The advantages of the proposed algorithm are that (a) there is no need to know the structure of the training data, (b) the number of fuzzy rules does not increase as the number of input dimensions increases, and (c) only a small number of fuzzy rules is generated. With the three generated fuzzy rules, a classification efficiency of 96.20% is achieved, which is comparable to that of other rule generation techniques.
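The prototype selection conditions (a) and (b) can be illustrated with a minimal Python sketch. The abstract does not define the fuzzy-rough uncertainty measure, so the sketch substitutes a simple stand-in: the mean of one minus the class membership over a pattern's k nearest neighbors. The function names `select_prototypes` and `classify`, the parameter `k`, the nearest-prototype decision rule, and the toy data are all assumptions made for illustration; none of them is taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_prototypes(X, U, k=3):
    """Pick one prototype index per class.

    X: list of feature vectors.
    U: list of membership rows, U[i][c] in [0, 1] for pattern i and class c.
    The uncertainty used here (mean of 1 - membership over the k nearest
    patterns) is an assumed stand-in, NOT the paper's fuzzy-rough measure.
    """
    n_classes = len(U[0])
    prototypes = {}
    for c in range(n_classes):
        best_idx, best_unc = None, float("inf")
        for i, row in enumerate(U):
            # Condition (a): the pattern's maximum membership is in class c.
            if max(range(n_classes), key=lambda j: row[j]) != c:
                continue
            # Neighborhood: the k nearest patterns, including the pattern itself.
            nearest = sorted(range(len(X)), key=lambda j: euclidean(X[i], X[j]))[:k]
            unc = sum(1.0 - U[j][c] for j in nearest) / len(nearest)
            # Condition (b): keep the candidate with the least uncertainty.
            if unc < best_unc:
                best_idx, best_unc = i, unc
        if best_idx is not None:
            prototypes[c] = best_idx
    return prototypes


def classify(x, X, prototypes):
    """Assign x to the class of its nearest prototype (one rule per prototype)."""
    return min(prototypes, key=lambda c: euclidean(x, X[prototypes[c]]))


# Hypothetical toy data: two well-separated groups plus one ambiguous pattern.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
     [5.0, 5.0], [5.1, 4.8], [4.9, 5.2],
     [3.0, 3.0]]
U = [[0.95, 0.05], [0.90, 0.10], [0.90, 0.10],
     [0.05, 0.95], [0.10, 0.90], [0.10, 0.90],
     [0.55, 0.45]]

protos = select_prototypes(X, U, k=3)
print(protos)
print(classify([1.1, 1.0], X, protos), classify([4.8, 5.1], X, protos))
```

In this toy setting the ambiguous pattern at (3.0, 3.0) satisfies condition (a) for class 0 but is rejected by condition (b), because its neighborhood mixes both classes. Each selected prototype then acts as the center of one cluster, which is the role the abstract assigns to a fuzzy rule.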