Anton Selikhov

Global Computing platforms, large-scale clusters, and future TeraGRID systems gather thousands of nodes for computing parallel scientific applications. At this scale, node failures or disconnections are frequent events. This Volatility reduces the MTBF of the whole system to the range of hours or minutes. We present MPICH-V, an automatic Volatility tolerant...