Fabrizio Grandoni

This paper presents a class of count-and-threshold mechanisms, collectively named α-count, which are able to discriminate between transient faults and intermittent faults in computing systems. For many years, commercial systems have been using transient fault discrimination via threshold-based techniques. We aim to contribute to the utility of(More)
This paper addresses the consolidated identification of faults, distinguished as transient or permanent/intermittent. Discrimination of transient faults has long been performed in commercial systems, where threshold-based techniques have been in use for several years. The present work aims to contribute to the usefulness of the(More)
Mechanisms for restoring the(More)
The structure of the Processing Element (PE), which is the basic component of SMA, is presented. The PE consists of a simple serial arithmetic unit, a local high-speed data memory, serial input and output ports, serial communication channels with neighbouring PEs, and some local control logic. The PE array operates under the control(More)
This paper deals with multiprocessor systems required to provide both high performance and good figures of dependability attributes. Fault tolerance is pursued through a proper combination and integration of a diagnostic mechanism, called α-count, with simple instances of redundancy-based error processing structures. The resulting fault tolerance strategies(More)
models of self-diagnostic techniques are well known, and established theoretical results are present in the literature. Multiprocessor systems are naturally suited for the application of these techniques, but no design is currently reported which embeds these techniques. This paper deals with the problems arising in the implementation of self-diagnostic(More)
Many critical applications require both correctness and timeliness. Proper solutions for these systems, called responsive systems [1], ask for approaches that integrate the fault tolerance and real-time facets. However, the two aspects have evolved for a long time almost independently from each other. In real-time contexts, the usual approach to cope with(More)
The largely computerised nature of critical infrastructures on the one hand, and the pervasive interconnection of systems all over the world, on the other hand, have generated one of the most fascinating current problems of computer science and control engineering: how to achieve resilience of critical information infrastructures, in particular in the(More)