William Hoarau

In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behavior of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important step in deploying reliable programs. In this paper, we present FAIL (for FAult...
One of the topics of paramount importance in the development of Grid middleware is the impact of faults, since their probability of occurrence in a Grid infrastructure and in large-scale distributed systems is very high. In this paper, we explore the versatility of a new tool for fault injection in distributed applications: FAIL-FCI. In particular, ...
One of the topics of paramount importance in the development of cluster and grid middleware is the impact of faults, since their occurrence in grid infrastructures and in large-scale distributed systems is common. MPI (Message Passing Interface) is a popular abstraction for programming distributed and parallel applications. FAIL (FAult Injection Language) is...
In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behavior of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important step in deploying reliable programs. In this paper, we extend FAIL-FCI (for Fault...
One important contribution to the community that is developing Grid middleware is the definition and implementation of benchmarks and tools to assess the performance and dependability of Grid applications and the corresponding middleware. In this paper, we present an experimental study that was conducted with OGSA-DAI, a popular middleware package that...
In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behaviour of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important step in deploying reliable programs. In this paper, we investigate the...
In this paper we present work on dependability benchmarking for Grid Computing that represents a common view between two groups of Core-Grid: INRIA-Grand Large and the University of Coimbra. We give a brief overview of the state of the art, followed by a presentation of the FAIL-FCI system from INRIA, which provides a tool for fault injection in...
In this paper we review several existing tools for fault injection and dependability benchmarking in grids. We emphasize the FAIL-FCI fault-injection software developed at INRIA Grand Large, and a benchmark tool called QUAKE developed at the University of Coimbra. We present the state of the art and explain the importance of...