William Hoarau

In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behavior of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important feature that matters in the deployment of reliable programs. In this paper, we present FAIL (for FAult…
One of the topics of paramount importance in the development of Grid middleware is the impact of faults, since their probability of occurrence in a Grid infrastructure and in large-scale distributed systems is actually very high. In this paper, we explore the versatility of a new tool for fault injection in distributed applications: FAIL-FCI. In particular, …
In a network consisting of several thousand computers, the occurrence of faults is unavoidable. Being able to test the behaviour of a distributed program in an environment where we can control the faults (such as the crash of a process) is an important feature that matters in the deployment of reliable programs. In this paper, we investigate the…
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults, since their occurrence in Grid infrastructures and in large-scale distributed systems is common. MPI (Message Passing Interface) is a popular abstraction for programming distributed and parallel applications. FAIL (FAult Injection Language) is…