Soon Myoung Chung

Feature selection is an important method for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the corpus. In this paper, we propose a new supervised feature selection method, named CHIR, which is based on the χ² statistic and new statistical data that can measure the…
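For orientation, the classical χ² statistic for a term and a category can be computed from a 2×2 contingency table of document counts. The sketch below is only that textbook statistic, not the CHIR method itself (the abstract is truncated before CHIR's additional statistic is described). The function names `contingency` and `chi_square`, and the representation of documents as sets of terms, are assumptions made for illustration.

```python
def contingency(docs, labels, term, category):
    """Count documents by (term present?, in category?).

    docs   -- list of sets of terms (hypothetical representation)
    labels -- list of category labels, one per document
    Returns (a, b, c, d):
      a: in category, contains term      b: not in category, contains term
      c: in category, lacks term         d: not in category, lacks term
    """
    a = b = c = d = 0
    for terms, label in zip(docs, labels):
        has_term = term in terms
        in_cat = label == category
        if has_term and in_cat:
            a += 1
        elif has_term:
            b += 1
        elif in_cat:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a, b, c, d):
    """Textbook chi-square statistic for a 2x2 term/category table."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A row or column is empty, so the term carries no information.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom
```

A term that is independent of the category scores 0, and larger values indicate stronger dependence. Note that the statistic alone does not distinguish positive from negative dependence.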
In this paper, we propose a role-based access control (RBAC) method for grid database services in Open Grid Services Architecture Data Access and Integration (OGSA-DAI). OGSA-DAI is an efficient grid-enabled middleware implementation of interfaces and services to access and control data sources and sinks. However, in OGSA-DAI, access control causes…
In this paper, we propose a new algorithm, named MSPX, which mines maximal sequential patterns by using multiple samples to effectively exclude infrequent candidates. MSPX begins with a bottom-up search. At each pass, however, instead of processing all candidates, it tries to find most of the infrequent ones by counting only the potentially…
In this paper, we propose an efficient scalable algorithm for mining Maximal Sequential Patterns using Sampling (MSPS). The MSPS algorithm reduces much more search space than other algorithms because both the subsequence infrequency-based pruning and the supersequence frequency-based pruning are applied. In MSPS, a sampling technique is used to identify…
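The two sequential-pattern abstracts above rely on three basic notions: whether a pattern is a subsequence of a data sequence, how often that happens (support), and which frequent patterns are maximal. A minimal sketch of these building blocks follows, assuming sequences of single items rather than itemsets. It does not implement MSPX or MSPS, whose sampling and pruning steps are not fully described here, and all function names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    it = iter(sequence)
    # 'item in it' advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


def maximal_only(patterns):
    """Keep patterns that are not a subsequence of any other given pattern."""
    return [
        p for p in patterns
        if not any(p != q and is_subsequence(p, q) for q in patterns)
    ]
```

The pruning rules named in the abstract follow from these definitions: every supersequence of an infrequent pattern is infrequent, and every subsequence of a frequent pattern is frequent and therefore cannot be maximal.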
In this paper, we propose a new parallel clustering algorithm, named Parallel Bisecting k-means with Prediction (PBKP), for message-passing multiprocessor systems. Bisecting k-means tends to produce clusters of similar sizes, and according to our experiments, it produces clusters with smaller entropy (i.e., purer clusters) than k-means does. Our PBKP…
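As background for the abstract above, plain bisecting k-means repeatedly picks one cluster and splits it in two with 2-means until k clusters exist. The sketch below is a sequential, single-process version under stated assumptions: it always splits the largest cluster and seeds each split deterministically with the first point and the point farthest from it. It includes neither the parallelization nor the prediction step of PBKP, and the helper names are hypothetical.

```python
def _dist2(p, q):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def _mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def two_means(points, iters=20):
    """Split points into two groups with a few rounds of 2-means."""
    # Assumed seeding: first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: _dist2(p, c0))
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if _dist2(p, c0) <= _dist2(p, c1)]
        right = [p for p in points if _dist2(p, c0) > _dist2(p, c1)]
        if not left or not right:
            break
        new0, new1 = _mean(left), _mean(right)
        if (new0, new1) == (c0, c1):
            break  # centroids stopped moving
        c0, c1 = new0, new1
    return left, right


def bisecting_kmeans(points, k):
    """Return up to k clusters by repeatedly bisecting the largest one."""
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        biggest = clusters.pop()
        if len(biggest) < 2:
            clusters.append(biggest)
            break
        left, right = two_means(biggest)
        if not left or not right:
            clusters.append(biggest)  # cannot be split further
            break
        clusters.extend([left, right])
    return clusters
```

Splitting the largest cluster first is one common selection rule and is consistent with the observation that bisecting k-means tends to produce clusters of similar sizes. Other rules, such as splitting the cluster with the lowest intra-cluster similarity, are also used.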
In this paper, we propose a new algorithm named Parallel Multipass with Inverted Hashing and Pruning (PMIHP) for mining association rules between words in text databases. The characteristics of text databases are quite different from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the…
Skin cancer is the most common type of cancer in the United States. A large, shared skin cancer image database on the Internet would be valuable to medical professionals and consumers. In this paper, a skin cancer image database is created using a three-tier system: a client application implemented in Java applets, a web server, and a backend…