LogBERT: Log Anomaly Detection via BERT

Haixuan Guo, Shuhan Yuan, Xintao Wu
Detecting anomalous events in online computer systems is crucial to protect the systems from malicious attacks or malfunctions. System logs, which record detailed information of computational events, are widely used for system status analysis. In this paper, we propose LogBERT, a self-supervised framework for log anomaly detection based on Bidirectional Encoder Representations from Transformers (BERT). LogBERT learns the patterns of normal log sequences by two novel self-supervised training…

A Taxonomy of Anomalies in Log Data
Log data anomaly detection is a core component in the area of artificial intelligence for IT operations. However, the large number of existing methods makes it hard to choose the right approach for a…
A2Log: Attentive Augmented Log Anomaly Detection
A2Log, an unsupervised anomaly detection method, consists of two steps: anomaly scoring and anomaly decision. It outperforms existing methods and can reach the scores of the strong baselines.
FastPacket: Towards Pre-trained Packets Embedding based on FastText for next-generation NIDS
FastPacket, a new approach for embedding packets based on character-level embeddings and inspired by FastText's success on text data, is proposed. It is measured on subsets of the CIC-IDS-2017 dataset and is expected to yield promising results on big-data pre-trained models.
Log-based Anomaly Detection Without Log Parsing
This work proposes NeuralLog, a novel log-based anomaly detection approach that does not require log parsing. It extracts the semantic meaning of raw log messages and represents them as semantic vectors, which are then used to detect anomalies through a Transformer-based classification model that can capture the contextual information from log sequences.
Unsupervised Cross-system Log Anomaly Detection via Domain Adaptation
  • Xiao Han, Shuhan Yuan
  • Computer Science
  • CIKM
  • 2021
A transferable log anomaly detection (LogTAD) framework is proposed that leverages the adversarial domain adaptation technique to make log data from different systems have a similar distribution so that the detection model is able to detect anomalies from multiple systems.


Online System Problem Detection by Mining Patterns of Console Logs
A novel application of data mining and statistical learning methods to automatically monitor and detect abnormal execution traces from console logs in an online setting. The approach is shown not only to achieve highly accurate and fast problem detection, but also to help operators better understand execution patterns in their system.
Deep Learning for Anomaly Detection: A Survey
This survey presents a structured and comprehensive overview of research methods in deep learning-based anomaly detection, grouping state-of-the-art deep anomaly detection research techniques into different categories based on the underlying assumptions and approach adopted.
DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning
DeepLog, a deep neural network model utilizing Long Short-Term Memory (LSTM), is proposed to model a system log as a natural language sequence. This allows DeepLog to automatically learn log patterns from normal execution and to detect anomalies when log patterns deviate from the model trained on log data under normal execution.
Anomaly Detection Using Program Control Flow Graph Mining From Execution Logs
The novelty in this work stems from the new techniques employed to overcome the instrumentation requirements or application-specific assumptions made in prior log mining approaches, and to improve the accuracy of mined templates and the CFG in the presence of long parameters and a high amount of interleaving, respectively.
Anomaly detection: A survey
This survey tries to provide a structured and comprehensive overview of the research on anomaly detection by grouping existing techniques into different categories based on the underlying approach adopted by each technique.
Large-Scale System Problems Detection by Mining Console Logs
This work first parses console logs by combining source code analysis with information retrieval to create composite features, and then analyzes these features using machine learning to automatically detect operational and system runtime problems.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. It can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Detecting large-scale system problems by mining console logs
This work first parses console logs by combining source code analysis with information retrieval to create composite features, and then analyzes these features using machine learning to automatically detect operational and system runtime problems.
What Supercomputers Say: A Study of Five System Logs
  • A. Oliner, Jon Stearley
  • Computer Science
  • 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07)
  • 2007
This paper examines system logs from five supercomputers with the aim of providing useful insight and direction for future research into the use of such logs, and proposes a simpler and more effective filtering algorithm.
Anomaly intrusion detection using one class SVM
  • Y. Wang, J. Wong, A. Miner
  • Computer Science
  • Proceedings from the Fifth Annual IEEE SMC Information Assurance Workshop, 2004.
  • 2004
This work extends kernel methods to the intrusion detection domain by introducing a new family of kernels suitable for intrusion detection, combined with an unsupervised learning method, the one-class support vector machine.