Seongbok Baik

System area networks (SANs) have been developed to address the needs of computing clusters. Myricom's Myrinet architecture is one of the predominant technologies in this area. One of the key issues for SANs is fault-tolerant routing. Myrinet provides Mapper software to discover and maintain network topology. Myrinet's Mapper is centralized, susceptible to probe…
Fault management in high-performance cluster networks has focused on hard faults (i.e., link or node failures). Network degradations that hurt performance but do not result in failures often go unnoticed. In this paper, we classify such degradations as soft faults. In addition, we identify consistent performance as an…
The acceleration in computational scale to solve problems in emerging "computational" fields, from nanoscience and genetics to astrophysics, places increasingly heavy compute and data storage burdens on locally and globally distributed computer systems. We are focusing on the management of these loosely coupled systems (clusters and Grids) which are asked…