A Lightweight Algorithm for Message Type Extraction in System Application Logs

@article{Makanju2012ALA,
  title={A Lightweight Algorithm for Message Type Extraction in System Application Logs},
  author={Adetokunbo Makanju and Ayse Nur Zincir-Heywood and Evangelos E. Milios},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2012},
  volume={24},
  pages={1921--1936}
}
Message type or message cluster extraction is an important task in the analysis of system logs in computer networks. Defining these message types automatically facilitates the automatic analysis of system logs. When the message types that exist in a log file are represented explicitly, they can form the basis for carrying out other automatic application log analysis tasks. In this paper, we introduce a novel algorithm for carrying out message type extraction from event log files. IPLoM, which… 
Use of Incremental Clustering in Clustering of Message Types in a Server Log
TLDR
This work proposes applying incremental clustering to extract data from the event log according to characteristics provided by the users of the system.
An Algorithm for Message Type Discovery in Unstructured Log Data
TLDR
This work identifies several deficiencies that limit the capabilities of existing algorithms and proposes a novel algorithm that can discover message types in already generated log data.
A Search-Based Approach for Accurate Identification of Log Message Formats
TLDR
The MoLFI approach, which recasts the log message identification problem as a multi-objective problem, uses an evolutionary approach to solve this problem, by tailoring the NSGA-II algorithm to search the space of solutions for a Pareto optimal set of message templates.
On Handling Redundancy for Failure Log Analysis of Cluster Systems
TLDR
This work develops a novel, generic log compression or filtering (i.e., redundancy removal) technique for system event logs, which enable system administrators to determine the causes of system failures and to detect them.
An Efficient Log Parsing Algorithm Based on Heuristic Rules
TLDR
The proposed CLF (Clustering based on Length and First token) algorithm, which extracts log event templates from raw logs using heuristic rules, is compared with three state-of-the-art log parsing algorithms; CLF ranks higher on most of the data sets and also has advantages in execution time.
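The grouping key named in CLF's expansion (message length and first token) can be illustrated with a minimal sketch. This is not the CLF algorithm itself: the blurb does not describe its heuristic rules, so the function name `group_by_length_and_first_token` and the whitespace tokenization are assumptions made for illustration only.

```python
from collections import defaultdict


def group_by_length_and_first_token(lines):
    """Bucket log lines by (token count, first token).

    Hypothetical helper: whitespace tokenization is an assumption,
    not something stated in the source.
    """
    groups = defaultdict(list)
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        groups[(len(tokens), tokens[0])].append(line)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Disk sda1 is full",
    ]
    for key, members in group_by_length_and_first_token(sample).items():
        print(key, len(members))
```

Lines that share a bucket are candidates for the same template; any further refinement inside a bucket is left out of this sketch.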
Spell: Online Streaming Parsing of Large Unstructured System Logs
  • Min Du, Feifei Li
  • Computer Science
    IEEE Transactions on Knowledge and Data Engineering
  • 2019
TLDR
This work proposes an online streaming method, Spell, which utilizes a longest common subsequence based approach, to parse system event logs, and shows its superiority in terms of both efficiency and effectiveness.
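The longest-common-subsequence idea mentioned for Spell can be shown on a single pair of messages. The sketch below is not Spell's streaming implementation; the function names, the whitespace tokenization, and the `<*>` wildcard are assumptions chosen for illustration.

```python
def lcs_tokens(a, b):
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Backtrack from the bottom-right corner to recover the tokens.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def template_from_pair(msg1, msg2, wildcard="<*>"):
    """Build a template from msg1, keeping tokens shared with msg2.

    Runs of tokens in msg1 that are not in the LCS collapse into a
    single wildcard (a simplification assumed for this sketch).
    """
    a, b = msg1.split(), msg2.split()
    common = lcs_tokens(a, b)
    template = []
    k = 0
    for tok in a:
        if k < len(common) and tok == common[k]:
            template.append(tok)
            k += 1
        elif not template or template[-1] != wildcard:
            template.append(wildcard)
    return " ".join(template)


if __name__ == "__main__":
    print(template_from_pair(
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
    ))
```

The shared tokens form the constant part of the message type, and the positions that differ are treated as parameters.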
An online log template extraction method based on hierarchical clustering
TLDR
The experimental analysis shows that LogOHC has a higher F1-score than the existing log template extraction methods, is suitable for multi-source log data sets, and has a shorter single-step execution time, which can meet the requirements of online real-time processing.
The Use of Template Miners and Encryption in Log Message Compression
TLDR
This paper uses six template miners to acquire the templates and evaluates the compression capacity of the dictionary method with the use of these algorithms, and examines the speed of the log miner algorithms.
A Directed Acyclic Graph Approach to Online Log Parsing
TLDR
This work presents Drain, an online log parsing method based on a directed acyclic graph that encodes specially designed rules for parsing. Drain has the highest accuracy on all 11 datasets and frees developers from the burden of parameter tuning by allowing them to use Drain with no pre-defined parameters.
FastLogSim: A Quick Log Pattern Parser Scheme Based on Text Similarity
TLDR
This paper proposes FastLogSim, a fast log parsing scheme based on text similarity that not only reduces the number of templates that need to be parsed from tens of millions to dozens, but also greatly improves the speed of pattern extraction.

References

Storage and retrieval of system log events using a structured schema based on message type transformation
TLDR
This work shows how message types can be used to impose structure on the unstructured content of event logs and how this structured representation can provide a usable index for searching the contents of the log file.
LogView: Visualizing Event Log Clusters
TLDR
The results based on different application log files show that LogView can ease the summarization of the vast amount of data contained in the log files, which in turn can help to speed up the analysis of event data in order to detect any security issues on a given application.
An integrated framework on mining logs files for computing system management
TLDR
This paper applies text mining techniques to categorize messages in log files into common situations, improves categorization accuracy by considering the temporal characteristics of log messages, and utilizes visualization tools to evaluate and validate the interesting temporal patterns for system management.
Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis
TLDR
This paper proposes an unstructured log analysis technique for anomaly detection, along with a novel algorithm that converts free-form text messages in log files to log keys without relying heavily on application-specific knowledge.
A Breadth-First Algorithm for Mining Frequent Patterns from Event Logs
TLDR
The properties of event log data are discussed, the suitability of popular mining algorithms for processing event log data is analyzed, and an efficient algorithm for mining frequent patterns from event logs is proposed.
A data clustering algorithm for mining patterns from event logs
  • R. Vaarandi
  • Computer Science
    Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003) (IEEE Cat. No.03EX764)
  • 2003
TLDR
A novel clustering algorithm for log file data sets is presented which helps one to detect frequent patterns from log files, to build log file profiles, and to identify anomalous log file lines.
Alert Detection in System Logs
TLDR
This work formalizes the alert detection task in these terms, describes how Nodeinfo uses the information entropy of message terms to identify alerts, and presents an online version of this algorithm, which is now in production use.
Fast entropy based alert detection in super computer logs
TLDR
This work shows that with Message Type Indexing (MTI) the computational effort required for alert detection can be reduced by up to 99%.
Process Mining in Web Services: The WebSphere Case
TLDR
This paper illustrates the potential of process mining in the context of web services and shows what a process mining tool like ProM can contribute in IBM’s WebSphere environment.
Capturing, indexing, clustering, and retrieving system history
We present a method for automatically extracting from a running system an indexable signature that distills the essential characteristic from a system state and that can be subjected to automated