Fault Management in Distributed Systems: A Policy-Driven Approach

@article{Lutfiyya2004FaultMI,
  title={Fault Management in Distributed Systems: A Policy-Driven Approach},
  author={H. Lutfiyya and M. Bauer and Andrew D. Marshall and D. Stokes},
  journal={Journal of Network and Systems Management},
  year={2004},
  volume={8},
  pages={499--525}
}
Managing the availability and performance of a distributed system involves monitoring the behavior of the system, identifying system problems, and correcting those problems. Each of these tasks requires some expertise, such as an understanding of the mechanics of the underlying system components. As the size and complexity of these systems increases, and the number of distributed applications executing on these systems increases, managing the availability and performance of distributed systems…
14 Citations
A study of service reliability and availability for distributed systems
A Survey of Fault Management in Wireless Sensor Networks
Dealing with Faults in Wireless Sensor Networks
BPMM: a grid-based architectural framework for business process meta management
Fault Management For Service-Oriented Systems

References

A modeling framework for integrated distributed systems fault management
Making distributed applications manageable through instrumentation
Services Supporting Management of Distributed Applications and Systems
Policies in network and systems management—Formal definition and architecture
  • R. Wies
  • Computer Science
  • Journal of Network and Systems Management
  • 2005
Policy Hierarchies for Distributed Systems Management
Efficient management data acquisition and run-time control of DCE applications using the OSI management framework
On a rule based management architecture
  • T. Koch, B. Kramer, G. Rohde
  • Computer Science
  • Second International Workshop on Services in Distributed and Networked Environments
  • 1995