Flaviu Cristian

ion. If a server depends on lower-level servers to correctly provide its service, then a failure of a certain type at a lower level of abstraction can result in a failure of a different type at the higher level of abstraction. For example, consider a value failure at the physical transmission layer o...
A probabilistic method is proposed for reading remote clocks in distributed systems subject to unbounded random communication delays. The method can achieve clock synchronization precisions superior to those attainable by previously published clock synchronization algorithms. Its use is illustrated by presenting a time service which maintains externally...
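The snippet above does not spell out the method, so the following is only a minimal sketch of the round-trip technique commonly associated with probabilistic remote clock reading. It assumes clock drift is negligible during one exchange and that a minimum one-way delay is known. The names `read_remote_clock`, `local_clock`, `request_remote_time`, `min_delay`, and `max_error` are hypothetical and chosen for illustration.

```python
from typing import Callable, Optional, Tuple


def read_remote_clock(
    local_clock: Callable[[], float],
    request_remote_time: Callable[[], float],
    min_delay: float,
    max_error: float,
    max_attempts: int = 10,
) -> Optional[Tuple[float, float]]:
    """Estimate a remote clock value from timed request/reply exchanges.

    Assumptions (not taken from the abstract): drift is ignored during one
    round trip, and min_delay is a known lower bound on one-way delay.
    Returns (estimate, error_bound), or None if no attempt was precise enough.
    """
    for _ in range(max_attempts):
        sent = local_clock()                  # local time when the request leaves
        remote_time = request_remote_time()   # remote clock value carried in the reply
        received = local_clock()              # local time when the reply arrives

        half_round_trip = (received - sent) / 2.0
        # The reply took between min_delay and (round trip - min_delay) to
        # arrive, so the midpoint estimate is off by at most this much.
        error_bound = half_round_trip - min_delay

        if error_bound <= max_error:
            return remote_time + half_round_trip, error_bound
        # Otherwise the exchange was too slow to be useful; try again.
    return None
```

Because delays are unbounded, a single exchange may be arbitrarily imprecise; retrying until a fast round trip occurs is what makes the achieved precision probabilistic rather than guaranteed.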
In loosely coupled distributed systems subject to random communication delays and component failures, atomic broadcast protocols can be used to implement the abstraction of a Δ-common storage, a replicated storage that displays at any clock time the same contents to every correct processor and that requires Δ time units to complete replicated updates. We...
Reaching agreement on the identity of correctly functioning processors of a distributed system in the presence of random communication delays, failures and processor joins is a fundamental problem in fault-tolerant distributed systems. Assuming a synchronous communication network that is not subject to partition occurrences, we specify the processor-group...
The first part of this paper provides rigorous definitions for several basic concepts underlying the design of dependable programs, such as specification, program semantics, exception, program correctness, robustness, failure, fault, and error. The second part investigates what it means to handle exceptions in modular programs structured as hierarchies of data...
We introduce the timed asynchronous distributed system model to describe existing asynchronous distributed systems subject to unbounded processing and communication delays, failures and recoveries. We then describe five increasingly strong specifications for processor-group membership services in timed asynchronous systems subject to partitioning. We also...
We propose a synchronous atomic broadcast protocol for distributed real-time systems based on redundant broadcast channels. The protocol can tolerate a finite number f of concurrent processor crash failures, channel adapter performance failures and channel omission failures. Its message cost is optimal: when no failures occur only f+1 messages are sent per...
The first part of this chapter provides rigorous definitions for several basic concepts underlying the design of dependable programs, such as specification, program semantics, exception, program correctness, robustness, failure, fault, and error. The second part investigates what it means to handle exceptions in modular programs structured as hierarchies of...
bounded responses with a certain probability. This article emphasized similarities between synchronous and asynchronous programming by discussing only strict agreement, the kind of asynchronous agreement closest to synchronous agreement. In reality, the field of asynchronous group communication is vaster, strict agreement being one extreme where all replicas...