A practical scheme for MPLS fault monitoring and alarm correlation in backbone networks
We propose a new distributed approach to alarm correlation and fault identification in computer networks. The managed network is divided into disjoint management domains, and each domain is assigned a dedicated intelligent agent. The agent is responsible for collecting, analyzing, and correlating the alarms emitted from the constituent entities of its domain. In the framework of Dempster-Shafer evidence theory, each agent treats each alarm as a piece of evidence for the occurrence of a certain fault hypothesis and correlates the received alarms into a single alarm, called the local composite alarm, which encapsulates the agent’s partial view of the current status of the managed system. Although the alarm correlation process is performed locally, each intelligent agent is able to correlate its alarms globally. These local composite alarms are, in turn, sent to a higher-level agent whose task is to fuse them and form a global view of the operational status of the running network. Extensive experiments have demonstrated that the proposed approach is more tolerant of alarm loss than codebook-based approaches, which shows its effectiveness in a typically noisy network environment.
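To make the evidence-combination step concrete, the following is a minimal sketch of Dempster's rule of combination, the standard fusion operator of Dempster-Shafer theory. It is an illustration only, not the authors' implementation: the function name `combine`, the fault hypotheses (`link_down`, `card_failure`, `congestion`), and all mass values are assumptions made up for the example. The sketch also assumes that an agent builds its local composite alarm by combining the mass functions of its alarms pairwise.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of fault hypotheses to a mass
    in [0, 1]; the masses of one function are assumed to sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of both sets.
            fused[common] = fused.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contradict each other; track the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting; rule undefined")
    # Renormalize so the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in fused.items()}


# Hypothetical frame of discernment (all fault hypotheses in one domain).
FRAME = frozenset({"link_down", "card_failure", "congestion"})

# Hypothetical alarms: each one is a piece of evidence with illustrative
# masses; mass assigned to FRAME expresses ignorance.
alarm_1 = {frozenset({"link_down"}): 0.6, FRAME: 0.4}
alarm_2 = {frozenset({"link_down", "card_failure"}): 0.7, FRAME: 0.3}

# Assumed construction of a local composite alarm from two alarms.
local_composite = combine(alarm_1, alarm_2)
for hypothesis, mass in local_composite.items():
    print(sorted(hypothesis), round(mass, 3))
```

Because Dempster's rule is commutative and associative, the same `combine` function could plausibly be reused by the higher-level agent to fuse the local composite alarms of several domains, and the order in which alarms arrive would not change the result. Whether the proposed approach uses exactly this operator at both levels is not stated in the abstract.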