Fabrizio Grandoni

This paper presents a class of count-and-threshold mechanisms, collectively named α-count, which are able to discriminate between transient faults and intermittent faults in computing systems. For many years, commercial systems have been using transient fault discrimination via threshold-based techniques. We aim to contribute to the utility of ...
This paper addresses the consolidated identification of faults, distinguished as transient or permanent/intermittent. Transient fault discrimination has long been performed in commercial systems: threshold-based techniques have been in practice for several years for this purpose. The present work aims to contribute to the usefulness of the ...
Mechanisms for restoring the state of a channel in an N-modular redundant architecture are necessary to prevent redundancy attrition due to transient faults and to allow failed channels to be brought back on line after repair. This paper considers software-implemented mechanisms for state restoration (SR) in a generic fault-tolerant architecture in which ...
Effective discrimination between transient and permanent faults is a very important practical problem in (dependable) system design. A count-and-threshold mechanism named α-count, designed to discriminate between transient faults and intermittent faults in computing systems, is presented in an enhanced embodiment. It retains enough simplicity to allow ...
This paper deals with multiprocessor systems required to provide both high performance and good dependability figures. Fault tolerance is pursued through a proper combination and integration of a diagnostic mechanism, called α-count, with simple instances of redundancy-based error processing structures. The resulting fault tolerance strategies ...
The structure of the Processing Element (PE), which is the basic component of SMA¹, is presented. The PE consists of a simple serial arithmetic unit, a local high-speed data memory, serial input and output ports, serial communication channels with neighbouring PEs, and some local control logic. The PE array operates under the control ...
Many critical applications require both correctness and timeliness. Proper solutions for these systems, called responsive systems [1], call for approaches that integrate the fault tolerance and real-time facets. However, the two aspects have long evolved almost independently of each other. In real-time contexts, the usual approach to cope with ...