Kazuto Kubota

We developed a compile-time metalevel architecture in C++, called the MPC++ metalevel architecture, to not only extend and modify language semantics, but also extend syntax. This architecture overcomes the imperative languages' issue of compile-time metalevel processing. The proposed metalevel architecture has been implemented and tested. A typical…
We have built an eight-node SMP cluster called COMPaS (Cluster Of Multi-Processor Systems), each node of which is a quad-processor Pentium Pro PC. We have designed and implemented a remote-memory-based user-level communication layer which provides low overhead and high bandwidth using Myrinet. We designed a hybrid programming model in order to take advantage…
NICAM is a communication layer for SMP PC clusters connected via Myrinet, designed to reduce overhead and latency by directly utilizing a microprocessor equipped on the network interface. It adopts remote memory operations to reduce much of the overhead found in message passing. NICAM employs an Active Messages framework for flexibility in programming on the…
Matrix clustering is a new data mining method which extracts a dense sub-matrix from a large sparse binary matrix. We propose an efficient algorithm named the ping-pong algorithm which enables real-time mining of a large sparse matrix. This article describes the application of matrix clustering to Web usage mining. Matrix clustering can be applied to Web…
A new gridless router accelerated by Content Addressable Memory (CAM) is presented. A gridless version of the line-expansion algorithm is implemented, which always finds a path if one exists. The router runs in linear time by means of the CAM-based accelerator. Experimental results show that the more obstacles there are in the routing region, the more…
A simulation technique for very large-scale data parallel programs is proposed. In our simulation method, a data parallel program is divided into computation and communication sections. When the control flow of the parallel program does not depend on the contents of network messages, the computation time on each processor is calculated independently. An…
This paper proposes a parallel data-mining algorithm and its implementation on a PC cluster. The decision tree is a widely used data-mining algorithm for classifying records in a database. Simple parallelization of decision tree generation is not efficient because of the load imbalance caused by the form of the generated tree. The SPRINT algorithm solves…
In this paper, we measure and compare the performance of shared- and distributed-memory multiprocessors using a parallel tree search problem to characterize these types of multiprocessors. We take the knapsack problem using the branch-and-bound algorithm as our workload. It is difficult to compare the performance using irregular parallel problems such as tree…
Using workstation clusters (WSCs) is a practical and cost-effective means of parallel computing. Fast workstations, using the latest RISC technology, can be assigned as computation nodes; however, communication between nodes is rather slow when using Ethernet. Recently, fast communication adaptors [2] have been developed, and the speed problem is being…