• Corpus ID: 239769218

A Comprehensive Survey of Logging in Software: From Logging Statements Automation to Log Mining and Analysis

Sina Gholamian and Paul A. S. Ward
Logs are widely used to record runtime information of software systems, such as the timestamp and the importance of an event, the unique ID of the source of the log, and a part of the state of a task’s execution. The rich information of logs enables system developers (and operators) to monitor the runtime behaviors of their systems and further track down system problems and perform analysis on log data in production settings. However, the prior research on utilizing logs is scattered and that… 
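To make the fields named in the abstract concrete, here is a minimal sketch using Python's standard `logging` module. It is not taken from the survey. The logger name `payment-service`, the function `process_task`, and the task ID are hypothetical, chosen only to show a timestamp, a severity level, a source identifier, and a fragment of task state in one record.

```python
import io
import logging

# Capture output in memory so the formatted record can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# asctime = timestamp, levelname = importance, name = source of the log.
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)

# "payment-service" is a hypothetical source name, not from the survey.
logger = logging.getLogger("payment-service")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the record out of the root logger


def process_task(task_id, retries):
    # The message carries part of the task's execution state.
    logger.warning("task %s retried %d times", task_id, retries)


process_task("t-17", 3)
record = stream.getvalue().strip()
print(record)
```

The printed line starts with the current timestamp, followed by `WARNING payment-service task t-17 retried 3 times`.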
Log severity level classification: an approach for systems in production
This research aims to decrease the overhead of monitoring systems by processing the severity level of log data from systems in production. It develops an automated approach to log severity level classification and demonstrates that reducing log severity level “noise” improves the monitoring of systems in production.


Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics
Loghub provides 17 real-world log datasets collected from a wide range of systems, including distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software, which facilitate more research on AI-powered log analytics.
DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning
DeepLog, a deep neural network model utilizing Long Short-Term Memory (LSTM), is proposed, to model a system log as a natural language sequence, which allows DeepLog to automatically learn log patterns from normal execution, and detect anomalies when log patterns deviate from the model trained from log data under normal execution.
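The idea of learning patterns from normal execution and flagging deviations can be illustrated with a much simpler stand-in. The sketch below is not DeepLog and uses no LSTM: it replaces the neural model with a frequency table over fixed-length histories of log keys. The function names, the `window` and `top_k` parameters, and the sample log keys are all assumptions made for illustration.

```python
from collections import Counter


def train(sequences, window=2):
    """Count which log key follows each history of `window` keys in normal runs."""
    model = {}
    for seq in sequences:
        for i in range(window, len(seq)):
            history = tuple(seq[i - window:i])
            model.setdefault(history, Counter())[seq[i]] += 1
    return model


def detect(model, seq, window=2, top_k=1):
    """Return positions whose key is not among the top_k keys seen after its history.

    A history never seen in training has no predictions, so the key
    that follows it is also reported as a deviation.
    """
    anomalies = []
    for i in range(window, len(seq)):
        history = tuple(seq[i - window:i])
        counts = model.get(history, Counter())
        expected = [key for key, _ in counts.most_common(top_k)]
        if seq[i] not in expected:
            anomalies.append(i)
    return anomalies


# Hypothetical log keys standing in for parsed log templates.
normal_runs = [["open", "read", "write", "close"]] * 3
model = train(normal_runs)

suspect = ["open", "read", "delete", "close"]
anomalous_positions = detect(model, suspect)
print(anomalous_positions)  # [2, 3]
```

Position 2 is flagged because `delete` never followed `open, read` in training. Position 3 is flagged because the history `read, delete` was never observed at all.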
FLAP: An End-to-End Event Log Analysis Platform for System Management
This paper designs and implements an integrated system, called FIU Log Analysis Platform, that aims to facilitate the data analytics for system event logs, and provides an end-to-end solution that utilizes advanced data mining techniques to assist log analysts to conveniently, timely, and accurately conduct event log knowledge discovery, system status investigation, and system failure diagnosis.
Characterizing logging practices in Java-based open source software projects – a replication study in Apache Software Foundation
A replication study of 21 different Java-based open source projects from three different categories shows that all projects contain logging code, which is actively maintained. However, contrary to the original study, bug reports containing log messages take longer to resolve than bug reports without log messages.
Logging Library Migrations: A Case Study for the Apache Software Foundation Projects
It is found that over 70% of the migrated projects encounter on average two post-migration bugs due to the new logging library, and performance is rarely improved after a migration.
The Bones of the System: A Case Study of Logging and Telemetry at Microsoft
It is found that the use of event data spans every job role in the interviews and survey, that different perspectives on event data create tensions between roles or teams, and that professionals report social and technical challenges across activities.
Log2: A Cost-Aware Logging Mechanism for Performance Diagnosis
Log2, a cost-aware logging mechanism, is implemented on an open source system as well as a real-world online service system from Microsoft. The experimental results show that Log2 can control logging overhead while preserving logging effectiveness.
LogCluster - A data clustering and pattern mining algorithm for event logs
The LogCluster algorithm is presented, which implements data clustering and line pattern mining for textual event logs, and an open source implementation of LogCluster is described.
The Unified Logging Infrastructure for Data Analytics at Twitter
This paper presents Twitter's production logging infrastructure and its evolution from application-specific logging to a unified "client events" log format, where messages are captured in common, well-formatted, flexible Thrift messages.
Improving software diagnosability via log enhancement
A tool, LogEnhancer, is described that automatically "enhances" existing logging code to aid in future post-failure debugging and can dramatically reduce the set of potential root failure causes that must be considered during diagnosis while imposing negligible overheads.