The Knowledge Grid needs to operate with a scalable platform to provide large-scale intelligent services. A key function of such a platform is to efficiently support various complex queries in a dynamic large-scale network environment. This paper proposes a platform to support index-based path queries by incorporating a semantic overlay with an underlying…
This paper presents Harmonic Ring (HRing), a structured peer-to-peer (P2P) overlay where long links are built along the ring with decreasing probabilities coinciding with the Harmonic Series. HRing constructs routing tables based on the distance between node positions instead of node IDs in order to eliminate the effect of node ID distribution on the long…
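The idea of choosing long links with harmonically decreasing probability can be illustrated with a minimal sketch. The abstract is truncated, so this is not HRing's actual construction: it assumes nodes occupy positions 0 to n-1 on the ring and that a long link spanning d positions is chosen with probability proportional to 1/d. The function names `harmonic_link_probabilities` and `sample_long_links` are hypothetical.

```python
import random


def harmonic_link_probabilities(n):
    """Probability of a long link spanning d positions, for d = 1..n-1.

    Assumption: the probability is proportional to 1/d, so the
    normalizing constant is the (n-1)-th harmonic number.
    """
    weights = [1.0 / d for d in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_long_links(position, n, k, rng):
    """Draw k long-link targets for the node at `position` on a ring of n positions."""
    probs = harmonic_link_probabilities(n)
    # Sample distances by position offset (not by node ID), with replacement.
    distances = rng.choices(range(1, n), weights=probs, k=k)
    return [(position + d) % n for d in distances]


# Hypothetical usage: a 16-position ring, 4 long links for the node at position 3.
probs = harmonic_link_probabilities(16)
links = sample_long_links(position=3, n=16, k=4, rng=random.Random(0))
```

Because the distances are measured in positions rather than in ID space, a skewed node ID distribution does not change the link-length distribution in this sketch, which is the property the abstract attributes to position-based routing tables.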
The resource space model (RSM) is a semantic data model based on orthogonal classification semantics for efficiently managing various resources in the future interconnection environment. This paper extends the RSM in theory by formalizing the resource space, investigating its characteristics from the perspective of set theory, defining the resource space…
The fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method works at the algorithm level and is called algorithmic recovery. These two methods can achieve high efficiency when…
With the growing scale of High Performance Computing (HPC) systems, faults are a norm rather than an exception. Exascale systems are projected to fail every 3 to 26 minutes [Schroeder and Gibson].
For graph traversal applications, fine-grained synchronization is required to exploit massive fine-grained parallelism. However, in the conventional solution using fine-grained locks, the locks themselves incur a huge memory cost and suffer from poor locality because access to vertices is inherently irregular. In this paper, we propose a novel fine-grained lock solution, vLock. The key idea is lock…
With the growing scale of high-performance computing (HPC) systems, today and more so tomorrow, faults are a norm rather than an exception. HPC applications typically tolerate fail-stop failures under the stop-and-wait scheme, where even if only one processor fails, the whole system has to stop and wait for the recovery of the corrupted data. It is now a…