Hiroshi Motoda

This paper proposes a novel approach named AGM to efficiently mine the association rules among the frequently appearing substructures in a given graph data set. A graph transaction is represented by an adjacency matrix, and the frequent patterns appearing in the matrices are mined through the extended algorithm of the basket analysis. Its performance has been(More)
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide(More)
A flash memory device driver that supports a conventional UNIX file system transparently was designed. To avoid the limitations due to flash memory's restricted number of write cycles and its inability to be overwritten, this driver writes data to the flash memory system sequentially as a Log-structured File System (LFS) does and uses a cleaner to collect(More)
Basket Analysis, which is a standard method for data mining, derives frequent itemsets from a database. However, its mining ability is limited to transaction data consisting of items. In reality, there are many applications where data are described in a more structural way, e.g. chemical compounds and Web browsing history. There are a few approaches that can(More)
The need for mining structured data has increased in the past few years. Graphs are among the best-studied data structures in computer science and discrete mathematics. It is therefore no surprise that graph-based data mining has become quite popular in the last few years. This article introduces the theoretical basis of graph-based data mining and(More)
Feature selection is an effective technique in dealing with dimensionality reduction for the classification task, a main component of data mining. It searches for an optimal subset of features. The search strategies under consideration are one of the three: complete, heuristic, and probabilistic. Existing algorithms adopt various measures to evaluate the goodness of(More)
1. Background and motivation. The digital technologies and computer advances with the booming internet uses have led to massive data collection (corporate data, data warehouses, webs, just to name a few) and information (or misinformation) explosion. Szalay and Gray described this phenomenon as "drowning in data" (Szalay and Gray, 1999). They reported that(More)
The rapid advance of computer technologies in data processing, collection, and storage has provided unparalleled opportunities to expand capabilities in production, services, communications, and research. However, immense quantities of high-dimensional data renew the challenges to the state-of-the-art data mining techniques. Feature selection is an(More)
We address the problem of estimating the parameters for a continuous time delay independent cascade (CTIC) model, a more realistic model for information diffusion in a complex social network, from the observed information diffusion data. For this purpose we rigorously formulate the likelihood of obtaining the observed data and propose an iterative method to obtain(More)