
Fault tolerance

Known as: Fail-soft operation, Fault-tolerance, Damage tolerant design 
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of (or one or more faults within…
Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2016
TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow…
Highly Cited
2007
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute…
Highly Cited
2006
Fundamentals.- Supervision and fault management of processes - tasks and terminology.- Reliability, Availability and…
Highly Cited
2005
To improve performance and reduce power, processor designers employ advances that shrink feature sizes, lower voltage levels…
Highly Cited
2005
This paper describes a general approach to constructing cooperative services that span multiple administrative domains. In such…
Highly Cited
1999
  • E. Rotenberg
  • Digest of Papers. Twenty-Ninth Annual…
  • 1999
  • Corpus ID: 9898124
This paper speculates that technology trends pose new challenges for fault tolerance in microprocessors. Specifically, severely…
Highly Cited
1995
  • J. Laprie
  • Twenty-Fifth International Symposium on Fault…
  • 1995
  • Corpus ID: 110377226
This paper provides a conceptual framework for expressing the attributes of what constitutes dependable and reliable computing…
Highly Cited
1990
1 Introduction.- Fault Prevention and Fault Tolerance.- Anticipated and Unanticipated Faults.- Book Aim.- References.- 2 System…
Highly Cited
1989
An Information Dispersal Algorithm (IDA) is developed that breaks a file F of length L…
Highly Cited
1984
The rapid progress in VLSI technology has reduced the cost of hardware, allowing multiple copies of low-cost processors to…