Monitoring Partially Synchronous Distributed Systems Using SMT Solvers

Vidhya Tekken Valapil, Sorrachai Yingchareonthawornchai, Sandeep S. Kulkarni, Eric K. Torng, Murat Demirbas
In this paper, we discuss the feasibility of monitoring partially synchronous distributed systems to detect latent bugs, i.e., errors caused by concurrency and race conditions among concurrent processes. We present a monitoring framework where we model both system constraints and latent bugs as Satisfiability Modulo Theories (SMT) formulas, and we detect the presence of latent bugs using an SMT solver. We demonstrate the feasibility of our framework using both synthetic applications where… 
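As a rough illustration of treating a synchrony assumption as a constraint, the sketch below asks whether two local intervals, timestamped by clocks that may differ by at most eps, could have overlapped in real time. It is not the paper's encoding: the names Interval, possibly_concurrent, find_possible_violations, and eps are hypothetical, message causality is ignored, and the SMT solver is replaced by a closed-form check that happens to suffice for this two-interval case.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A span during which a local predicate held (hypothetical record type)."""
    process: str
    start: int  # local clock value when the predicate became true
    end: int    # local clock value when the predicate became false


def possibly_concurrent(a: Interval, b: Interval, eps: int) -> bool:
    """Return True if some ta in [a.start, a.end] and tb in [b.start, b.end]
    satisfy |ta - tb| <= eps, i.e. the gap between the intervals is at most eps.
    Assumes eps >= 0 and start <= end for both intervals."""
    return max(a.start, b.start) - min(a.end, b.end) <= eps


def find_possible_violations(
    intervals: List[Interval], eps: int
) -> List[Tuple[Interval, Interval]]:
    """List pairs from different processes whose predicates may have held
    at the same instant under the assumed skew bound."""
    return [
        (a, b)
        for a, b in combinations(intervals, 2)
        if a.process != b.process and possibly_concurrent(a, b, eps)
    ]


if __name__ == "__main__":
    log = [Interval("p1", 0, 10), Interval("p2", 12, 20)]
    for eps in (1, 3):
        print(eps, find_possible_violations(log, eps))
```

With a skew bound smaller than the gap between the two intervals, no pair is reported; once the bound reaches the gap, the pair is flagged as a possible violation. A solver-based monitor would pose the same kind of question for richer predicates and constraints.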
Distributed Runtime Verification Under Partial Synchrony
This paper studies runtime verification of distributed applications that do not share a global clock against specifications in linear temporal logic, and proposes a distributed monitoring algorithm that employs SMT solving techniques.
Efficient Two-Layered Monitor for Partially Synchronous Distributed Systems
This work presents a new, efficient two-layered monitoring approach that overcomes both the time and space limitations of earlier monitors and shows that the combination of these two layers reduces the cost of monitoring by 85-95%.
Crash-Resilient Decentralized Synchronous Runtime Verification
This paper proposes an automata-based synchronous monitoring algorithm that copes with crash failures of monitors and emits a symbolic verdict that efficiently encodes their partial views.
Testing for Race Conditions in Distributed Systems via SMT Solving
Data races, in which two accesses to the same memory location occur concurrently, have been shown to be a major source of concurrency bugs in distributed systems. Unfortunately, data
Using Weaker Consistency Models with Monitoring and Recovery for Improving Performance of Key-Value Stores
Overall, for applications considered in this paper, it is found that even with rollback, eventual consistency provides better performance than using sequential consistency.
Distributed Runtime Verification of Metric Temporal Properties for Cross-Chain Protocols
A generalized runtime verification technique for checking partially synchronous distributed computations against metric temporal logic (MTL) specifications. It exploits bounded-skew clock synchronization and a progression-based formula rewriting scheme for monitoring MTL specifications that employs SMT solving techniques; experimental results are reported.
Retroscope: Retrospective Monitoring of Distributed Systems
The Retroscope search algorithm is embarrassingly parallel and can employ many worker processes (each processing up to 150,000 consistent snapshots per second) to handle a single query.
Tests and Proofs: 14th International Conference, TAP 2020, Held as Part of STAF 2020, Bergen, Norway, June 22–23, 2020, Proceedings
A test-case generation algorithm and its implementation, in terms of an open-source Matlab toolbox for conformance testing cyber-physical systems, and a number of case-studies conducted in the automotive domain, including a case-study on platooning and another one on doping detection concerning diesel car emissions.


Precision, Recall, and Sensitivity of Monitoring Partially Synchronous Distributed Systems
Runtime verification focuses on analyzing the execution of a given program by a monitor to determine if it is likely to violate its specifications. There is often an impedance mismatch between the
Efficient Algorithms for Predicate Detection using Hybrid Logical Clocks
This work focuses on using hybrid logical clocks (HLCs) to perform wait-free and efficient predicate detection and developing efficient algorithms for detecting weak conjunctive predicates (WCPs) with the help of HLC and then extending them to detect arithmetic predicates such as those necessary for expressing resource usage, network density, and so on.
Decentralized Runtime Verification of LTL Specifications in Distributed Systems
This thesis proposes the first sound and complete method for runtime verification of asynchronous distributed programs for the 3-valued semantics of LTL specifications defined over the global state of the program.
Efficient and Generalized Decentralized Monitoring of Regular Languages
This paper proposes an efficient and generalized decentralized monitoring algorithm that allows local monitors alone to detect satisfaction or violation of any regular specification in a system
Predicate Detection in Asynchronous Distributed Systems: A Probabilistic Approach
A unified algorithmic framework is proposed for detecting various types of predicates, including simple predicates, simple sequences, and interval-constrained sequences; its application shows that the approach is effective and outperforms existing approaches.
Efficient decentralized monitoring of safety in distributed systems
An efficient decentralized monitoring algorithm that checks a distributed program's execution for violations of safety properties. It introduces the notion of a Knowledge Vector and an algorithm that keeps each process aware of other processes' local states that can affect the validity of a monitored PT-DTL formula.
Detection of Weak Unstable Predicates in Distributed Programs
This paper discusses detection of weak conjunctive predicates that are formed by conjunction of predicates local to processes in the system, and detects even unstable predicates, without excessive overhead.
Beyond TrueTime: Using AugmentedTime for Improving Spanner
AugmentedTime (AT), which combines the best of TT-based wallclock ordering with causality-based ordering in asynchronous distributed systems, is proposed; it can be used in lieu of (or in addition to) TT in Spanner for timestamping and querying data efficiently.
A Distributed Abstraction Algorithm for Online Predicate Detection
This paper presents the first distributed online algorithm for computing the slice of a distributed computation with respect to a regular predicate. It distributes the work and storage requirements across the system, thus reducing the space and computation complexity per process.
Detecting causal relationships in distributed computations: In search of the holy grail
It is shown that characterizing the causal relationship between significant events is an important but non-trivial aspect for understanding the behavior of distributed programs.