Application Level Checkpoint-based Approach for Crush Failure in Distributed System

Abstract

Fault tolerance is a critical issue in distributed and parallel processing systems. A distributed system consists of a collection of interconnected stand-alone computers working together as a single system to produce a complete result. If the number of computing nodes is increased concurrently and dynamically in distributed computing, it may have the…

8 Figures and Tables
