Efficient Large-Scale Trace Checking Using MapReduce

@article{Bersani2016EfficientLT,
  title={Efficient Large-Scale Trace Checking Using MapReduce},
  author={Marcello M. Bersani and Domenico Bianculli and Carlo Ghezzi and Srdan Krstic and Pierluigi San Pietro},
  journal={2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)},
  year={2016},
  pages={888-898}
}
The problem of checking a logged event trace against a temporal logic specification arises in many practical cases. Unfortunately, known algorithms for an expressive logic like MTL (Metric Temporal Logic) do not scale with respect to two crucial dimensions: the length of the trace and the size of the time interval of the formula to be checked. The former issue can be addressed by distributed and parallel trace checking algorithms that can take advantage of modern cloud computing and programming… 
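To make the problem statement concrete, here is a minimal single-machine sketch of checking one bounded MTL operator, "eventually within [lo, hi]", over a timestamped trace. It is not the paper's MapReduce algorithm; the trace encoding, the function name `eventually_within`, and the sample events are assumptions made for illustration. The window scanned for each position grows with the size of the interval, which hints at why large time intervals are costly.

```python
from bisect import bisect_left, bisect_right
from typing import List, Set, Tuple

# Hypothetical trace encoding: (timestamp, propositions that hold at that time).
Event = Tuple[int, Set[str]]


def eventually_within(trace: List[Event], prop: str, lo: int, hi: int) -> List[bool]:
    """For each position i, report whether `prop` holds at some position j
    with lo <= t_j - t_i <= hi (pointwise reading of F_[lo,hi] prop).

    Assumes the trace is sorted by non-decreasing timestamp.
    """
    times = [t for t, _ in trace]
    verdicts = []
    for t_i in times:
        # Locate the slice of positions whose timestamps fall in [t_i+lo, t_i+hi].
        start = bisect_left(times, t_i + lo)
        end = bisect_right(times, t_i + hi)
        # Linear scan of the window: cost grows with the interval size.
        verdicts.append(any(prop in trace[j][1] for j in range(start, end)))
    return verdicts


# Illustrative trace (made up): every "req" should see an "ack" within 5 time units.
trace = [(1, {"req"}), (3, set()), (4, {"ack"}), (9, {"req"}), (20, {"ack"})]
verdicts = eventually_within(trace, "ack", 0, 5)

# Positions where "req" holds but no "ack" follows within the bound.
violations = [i for i, (_, props) in enumerate(trace) if "req" in props and not verdicts[i]]
```

In this made-up trace the request at time 9 is the only one without an acknowledgement in its window, so `violations` contains position 3.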

A Model-Driven Approach to Offline Trace Checking of Temporal Properties with OCL
TLDR
The goal of this paper is to present a practical and scalable solution for the offline checking of the temporal requirements of a system, which can be used in contexts where model-driven engineering is already a practice, and where relying on standards and industry-strength tools for property checking is a fundamental prerequisite.
A Model-driven Approach to Trace Checking of Temporal Properties with Aggregations
TLDR
This paper presents TemPsy-AG, an extension of TemPsy (an existing pattern-based language for the specification of temporal properties) to support service provisioning patterns that use aggregation operators, and extends an existing model-driven procedure for trace checking to verify properties expressed in TemPsy-AG.
A Model-Driven Approach to Trace Checking of Pattern-Based Temporal Properties
  • Wei Dou, D. Bianculli, L. Briand
  • Computer Science
    2017 ACM/IEEE 20th International Conference on Model Driven Engineering Languages and Systems (MODELS)
  • 2017
TLDR
The results of the evaluation show the feasibility of applying the model-driven approach for trace checking in realistic settings: TEMPSY-CHECK scales linearly with respect to the length of the input trace and can analyze traces with one million events in about two seconds.
Inferring Software Behavioral Models with MapReduce
TLDR
With the parallel data processing capacity of MapReduce, the problem of inferring behavioral models from large logs can be efficiently solved and the technique is implemented on top of Hadoop.
Scalable Online Monitoring of Distributed Systems
TLDR
It is argued that scalable online monitors must ingest events from multiple sources in parallel, and a general model for input to such monitors is proposed, which only assumes a low-resolution global clock and allows for out-of-order events, which makes it suitable for distributed systems.
Scalable online first-order monitoring
TLDR
This work shows how to scale up first-order monitoring to substantially higher velocities by slicing the stream, based on the events’ data values, into substreams that can be monitored independently.
A Survey of Challenges for Runtime Verification from Advanced Application Domains (Beyond Software)
TLDR
This paper presents a collection of challenges for runtime verification extracted from concrete application domains, focusing on the difficulties that must be overcome to tackle these specific challenges.
Scalable Online First-Order Monitoring
TLDR
This work scales up monitoring to higher velocities by slicing the stream, based on the events’ data values, into substreams that can be independently monitored. It implements the resulting automatic data slicer in Apache Flink and uses the MonPoly tool to monitor the substreams.
The hypercube and other hash-based partitioning schemes are sensitive to skew
TLDR
This work scales up monitoring to higher velocities by slicing the stream, based on the events’ data values, into substreams that can be independently monitored. It implements the resulting automatic data slicer in Apache Flink and uses the MonPoly tool to monitor the substreams.

References

SHOWING 1-10 OF 35 REFERENCES
Parallelized Runtime Verification of First-order LTL Specifications
TLDR
This paper presents a novel and efficient parallel algorithm for verifying a highly expressive fragment of first-order LTL specifications, where nested quantifiers can be subject to second-order numerical constraints.
MapReduce for parallel trace validation of LTL properties
We present an algorithm for the automated verification of Linear Temporal Logic formulæ on event traces using an increasingly popular cloud computing framework called MapReduce. The algorithm can
Monitoring Algorithms for Metric Temporal Logic Specifications
Monitoring Parametric Temporal Logic
TLDR
This work applies runtime verification to obtain quantitative information about the execution, based on linear-time temporal properties: the temporal specification is extended to include parameters that are instantiated according to a measure obtained at runtime.
Validating Real-time Systems By History-checking TRIO Specifications
TLDR
An efficient algorithm to perform history-checking, i.e., to check that a history of the system satisfies the specification, is presented, and this algorithm can be used as a basis for an effective specification testing tool.
Trace Checking of Metric Temporal Logic with Aggregating Modalities Using MapReduce
TLDR
Logs can be analyzed using trace checking techniques to check whether the system complies with its requirements specifications, which include timing constraints as well as higher-level constraints on the occurrences of significant events, expressed using aggregate operators.
A Finite-Domain Semantics for Testing Temporal Logic Specifications
TLDR
The need for correcting previous semantics proposals is shown, especially in the case of specifications of real-time systems which require the use of bounded temporal operators.
Scalable Offline Monitoring
TLDR
This work proposes an approach to monitoring IT systems offline, where system actions are logged in a distributed file system and subsequently checked for compliance against policies formulated in an expressive temporal logic, and develops a formal framework for slicing logs and an algorithmic realization based on MapReduce.
Online Monitoring of Metric Temporal Logic
TLDR
This paper adapts a separation technique enabling us to rewrite arbitrary MTL formulas into LTL formulas over a set of atoms comprising bounded MTL equations, and obtains the first trace-length independent online monitoring procedure for full MTL in a dense-time setting.
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
TLDR
Resilient Distributed Datasets is presented, a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner and is implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.