Existing parallel mining algorithms for frequent itemsets lack a mechanism that enables automatic parallelization, load balancing, data distribution, and fault tolerance on large clusters. As a solution to this problem, we design a parallel frequent itemsets mining algorithm called FiDoop using the MapReduce programming model. To achieve compressed storage...
Association rule mining, in which generating frequent patterns is a key step, is an effective way of identifying inherent and unknown interrelationships between the characteristics of celestial spectral data and their physicochemical properties. In this study, we first use first-order predicate logic to represent knowledge derived from celestial...
A 3D model watermarking algorithm that effectively resists affine transforms is designed. The algorithm uses a high-capacity grayscale image as the watermark information, and chooses the Nielson norm combined with the redundancies of triangle meshes. Experimental results show that the algorithm can effectively resist attacks such as translation,...
Decision trees are an effective way of extracting classification rules. For incomplete decision systems, an algorithm for constructing a decision tree and extracting classification rules is presented. It is based on the logical relationship between attributes, described with a generalized decision function, and uses conditional information entropy as the heuristic. In...
Seeking unknown celestial bodies is one of the profound goals of mankind's exploration of the universe. Outlier mining is an effective way of finding the spectral data of unknown celestial bodies. Using outlier mining as the way of analyzing star spectrum data, and VC++ and Oracle9i as development tools, the outlier mining system on star spectrum data...
Traditional parallel algorithms for mining frequent itemsets aim to balance load by equally partitioning data among a group of computing nodes. We start this study by discovering a serious performance problem of the existing parallel Frequent Itemset Mining algorithms. Given a large dataset, data partitioning strategies in the existing solutions suffer high...