Large-scale distributed systems such as Grids have several characteristics that make them difficult to study using only theoretical models and simulators. Most Grids deployed at large scale are production platforms, which makes them inappropriate as research tools because of their limited reconfiguration, control, and monitoring capabilities. In this paper, we present …
We tackle the problem of scheduling task graphs onto a heterogeneous set of machines, where each processor has a probability of failure governed by an exponential law. The goal is to design algorithms that optimize both makespan and reliability. First, we provide an optimal scheduling algorithm for independent unitary tasks where the objective is to …
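The exponential failure law mentioned here gives the bi-objective trade-off a simple closed form: a processor with failure rate lambda stays alive for a duration t with probability e^(-lambda*t), so the reliability of a schedule is the product of these terms over all task executions. Below is a minimal Python sketch of evaluating both objectives for a static mapping; the speeds, failure rates, and instance are illustrative, not taken from the paper:

    import math

    def schedule_metrics(tasks, mapping, speed, lam):
        """Makespan and reliability of a static mapping.
        tasks: task -> work; mapping: task -> processor index;
        speed: processor speeds; lam: exponential failure rates."""
        finish = [0.0] * len(speed)
        reliability = 1.0
        for task, work in tasks.items():
            p = mapping[task]
            duration = work / speed[p]
            finish[p] += duration
            # P(processor p survives this execution) = exp(-lam[p] * duration)
            reliability *= math.exp(-lam[p] * duration)
        return max(finish), reliability

    # Illustrative instance: three independent tasks, two heterogeneous processors.
    tasks = {"t1": 4.0, "t2": 2.0, "t3": 2.0}
    mapping = {"t1": 0, "t2": 1, "t3": 1}
    print(schedule_metrics(tasks, mapping, speed=[2.0, 1.0], lam=[0.01, 0.001]))

Mapping more work onto the fast processor shortens the makespan but, if that processor is also failure-prone, degrades reliability; this is exactly the tension such bi-objective algorithms must navigate.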
MPI process placement can play a decisive role in application performance. This is especially true on today's architectures (heterogeneous, multicore, with several levels of cache, etc.). In this paper, we describe a novel algorithm called TreeMatch that maps processes to resources in order to reduce the communication cost of the …
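TreeMatch itself is specified in the paper; purely to illustrate the kind of problem it addresses, here is a naive greedy mapper that co-locates the most heavily communicating process pairs on the closest cores, given a process communication matrix and a core distance matrix derived from the hardware topology. Everything below is a hypothetical stand-in, not the TreeMatch algorithm:

    import itertools

    def greedy_map(comm, dist):
        """Repeatedly place the unmapped process pair exchanging the
        most data on the closest pair of free cores.
        comm[i][j]: traffic between processes i and j;
        dist[a][b]: topology distance between cores a and b."""
        n = len(comm)
        placement, free = {}, set(range(n))
        pairs = sorted(itertools.combinations(range(n), 2),
                       key=lambda p: comm[p[0]][p[1]], reverse=True)
        for i, j in pairs:
            if i in placement or j in placement:
                continue
            a, b = min(itertools.combinations(free, 2),
                       key=lambda c: dist[c[0]][c[1]])
            placement[i], placement[j] = a, b
            free -= {a, b}
        for i in set(range(n)) - placement.keys():
            placement[i] = free.pop()  # leftover process when n is odd
        return placement

    # Processes 0-1 and 2-3 talk a lot; cores 0-1 and 2-3 share a cache level.
    comm = [[0, 9, 1, 1], [9, 0, 1, 1], [1, 1, 0, 8], [1, 1, 8, 0]]
    dist = [[0, 1, 2, 2], [1, 0, 2, 2], [2, 2, 0, 1], [2, 2, 1, 0]]
    print(greedy_map(comm, dist))  # chatty pairs land on neighbouring cores

A greedy pass like this ignores the global structure of the machine; exploiting the full hierarchy of caches and nodes at once is what a tree-aware algorithm such as TreeMatch brings.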
Scheduling stochastic workloads is a difficult task. Designing efficient scheduling algorithms for such workloads requires a good in-depth knowledge of basic random scheduling strategies. This paper analyzes the distribution of sequential jobs and the system behavior in heterogeneous computational grid environments where the …
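As a toy illustration of the kind of baseline such an analysis needs, the sketch below simulates blind uniform-random dispatch of sequential jobs onto heterogeneous processors; the job-size distribution and processor speeds are invented for the example:

    import random

    def simulate(n_jobs, speeds, seed=0):
        """Dispatch jobs uniformly at random and return each
        processor's accumulated busy time."""
        rng = random.Random(seed)
        busy = [0.0] * len(speeds)
        for _ in range(n_jobs):
            work = rng.expovariate(1.0)        # job sizes ~ Exp(1)
            p = rng.randrange(len(speeds))     # blind uniform dispatch
            busy[p] += work / speeds[p]
        return busy

    # Slow processors accumulate far more busy time than fast ones:
    # the imbalance a random-strategy analysis has to quantify.
    print(simulate(10_000, speeds=[4.0, 2.0, 1.0]))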
The GridRPC model [17] is an emerging standard promoted by the Global Grid Forum (GGF) that defines how to perform remote client-server computations on a distributed architecture. In this model, data are sent back to the client at the end of every computation. This implies unnecessary communications when computed data are needed by another server in …
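The redundant transfer is easy to quantify with a toy byte-accounting model of a two-server pipeline; the persistence scheme contrasted below is the general idea, not the paper's actual API:

    def bytes_moved(size, persistent):
        """Bytes transferred for a client -> A -> B -> client pipeline,
        assuming the intermediate result is as large as the input."""
        if persistent:
            # input to A, A's result straight to B, final result to client
            return 3 * size
        # input to A, result back to client, re-sent to B, result back
        return 4 * size

    size = 500 * 2**20  # a 500 MiB intermediate dataset (illustrative)
    saved = bytes_moved(size, False) - bytes_moved(size, True)
    print(saved)  # one full copy of the dataset never crosses the client link

Keeping computed data in place on the servers thus saves a complete copy of every intermediate result, plus the client's round-trip latency.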
In this paper we tackle the problem of scheduling a periodic real-time system on identical multiprocessor platforms, where, moreover, the tasks considered may fail with a given probability. For each task we compute its duplication rate in order to (1) given a maximum tolerated probability of failure, minimize the size of the platform such that at least one replica of …
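Under independent failures, such a duplication rate has a standard closed form: if one instance of a task fails with probability p, all k replicas fail with probability p^k, so meeting a failure bound epsilon requires the smallest k with p^k <= epsilon, i.e. k = ceil(log(epsilon) / log(p)). A minimal sketch with invented parameters:

    def replicas_needed(p_fail, epsilon):
        """Smallest k such that p_fail**k <= epsilon, i.e. at least one
        of the k replicas succeeds with probability >= 1 - epsilon
        (failures assumed independent)."""
        k, prob = 1, p_fail
        while prob > epsilon:
            k += 1
            prob *= p_fail
        return k

    # A task failing 20% of the time needs 9 replicas for a 1e-6 bound:
    # 0.2**9 ~= 5.1e-7 <= 1e-6, while 0.2**8 ~= 2.6e-6 is too large.
    print(replicas_needed(0.2, 1e-6))  # -> 9

The platform must then be large enough to host all replicas of all tasks at once, which is presumably the size being minimized.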
Quickly transmitting large datasets in the context of distributed computing on wide-area networks can be achieved by compressing data before transmission. However, such an approach is not efficient when dealing with higher-speed networks. Indeed, the time to compress a large file and then send it is greater than the time to send the uncompressed file. In this …
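The break-even condition is a one-liner: compression wins only when the time to compress plus send the smaller file beats sending the raw file. A toy model assuming the two phases run strictly sequentially (all throughput and compression-ratio figures are invented):

    def compression_wins(size, bandwidth, comp_speed, ratio):
        """True when compress-then-send beats sending raw.
        Units must be consistent, e.g. MB and MB/s.
        bandwidth: network throughput; comp_speed: compressor throughput;
        ratio: compressed size / original size."""
        raw = size / bandwidth
        compressed = size / comp_speed + size * ratio / bandwidth
        return compressed < raw

    # Slow WAN link: compression helps (100 s raw vs 60 s compressed).
    print(compression_wins(1000, bandwidth=10, comp_speed=50, ratio=0.4))    # True
    # Fast LAN link: compression hurts (1 s raw vs 20.4 s compressed).
    print(compression_wins(1000, bandwidth=1000, comp_speed=50, ratio=0.4))  # False

Overlapping compression with transmission would shift this threshold; the model above deliberately keeps the two phases sequential.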