A Cluster Based Model for Resource Management and Fault Tolerance in Distributed Real-time Systems

Abstract

We propose a distributed system model based on clusters of processors and processes to manage resources and provide fault tolerance in a distributed real-time system. At the lowest level, time is managed in a distributed clock synchronization module by a leader process that provides agreement on a common clock value among the representatives of the clusters, which then dictate this value to the individual members of each cluster. For fault tolerance in such a system, process groups are used to manage replicas of processes using a similar concept: the ordering of events is managed by the leader and the representatives of the process groups of each cluster, which then dictate this ordering to every process group member in each cluster. Finally, static load balancing is performed in two phases: first the real-time tasks are allocated to clusters of processors, and then they are scheduled within individual clusters to meet deadline constraints.
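
The two-level clock synchronization described above can be illustrated with a minimal sketch. It is not the authors' protocol: the abstract does not state how the leader reaches agreement, so the median of the representatives' readings is used here purely as an assumption, and the function name `synchronize`, the dictionary layout, and the convention that the first entry of each list is the cluster representative are all hypothetical.

```python
from statistics import median


def synchronize(clusters):
    """Return new clock values for every member of every cluster.

    clusters maps a cluster id to a list of clock readings; by assumption,
    index 0 of each list belongs to that cluster's representative.
    """
    # Level 1: the leader collects one reading per cluster representative
    # and settles on a common value (median is an assumed agreement rule).
    representative_readings = [clocks[0] for clocks in clusters.values()]
    agreed = median(representative_readings)

    # Level 2: each representative dictates the agreed value to its members.
    return {cid: [agreed] * len(clocks) for cid, clocks in clusters.items()}


clusters = {
    "a": [100.0, 103.0],
    "b": [104.0, 99.0, 101.0],
    "c": [102.0],
}
synced = synchronize(clusters)
```

Only the representatives take part in the agreement step, so the number of participants in agreement grows with the number of clusters rather than with the total number of processes. The abstract describes the same leader-and-representatives pattern for dictating event ordering to process groups.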

Cite this paper

@inproceedings{Erciye2007ACB, title={A Cluster Based Model for Resource Management and Fault Tolerance in Distributed Real-time Systems}, author={Kayhan Erciye and Oznur Ozkasap and T. Tunali}, year={2007} }