CLAP: recording local executions to reproduce concurrency failures

@inproceedings{Huang2013CLAPRL,
  title={CLAP: recording local executions to reproduce concurrency failures},
  author={Jeff Huang and Charles Zhang and Julian T Dolby},
  booktitle={PLDI},
  year={2013}
}
Shared Memory Dependence Reduction via Bisectional Coordination
TLDR
This paper designs and implements the bisectional coordination protocol, which dynamically maintains a partition of the program’s address space without prior knowledge of it, such that shared variables in each partitioned interval have consistent thread and spatial locality properties.
Production-guided concurrency debugging
TLDR
Evaluation on popular benchmarks shows that Cortex is able to expose failing schedules with only a few perturbations to non-failing executions, and takes a practical amount of time.
MESS: Memory Performance Debugging on Embedded Multi-core Systems
TLDR
MESS systematically discovers the order of memory-access operations that expose performance bugs due to shared caches, and proposes an approximate solution that dramatically reduces debugging time, at the cost of a reasonable amount of false positives.
Relaxed Logging for Replay of Multithreaded Applications
TLDR
A study of existing record-and-replay approaches, reflecting on the trade-offs and decisions of each system, is presented, along with a new approach, relaxed logging, that aims to reduce the cost of the record phase without substantially increasing the time required to execute the replay phase.
Report on the international symposium on high confidence software (ISHCS 2011/2012)
To provide a forum for researchers in related research areas to address the challenges in high confidence software, exchange ideas, and foster collaborations, the Institute of Software and Key
Reproducing Concurrency Bugs Using Local Clocks
TLDR
This paper proposes an effective approach that takes advantage of the hardware clocks available on modern commercial processors to reduce the overhead in recording and analyzing events’ global order by using time stamps recorded in each thread.
Towards Production-Run Heisenbugs Reproduction on Commercial Hardware
TLDR
A new technique, H3, is presented, which integrates the hardware control flow tracing capability provided in recent Intel processors with symbolic constraint analysis, allowing it to reproduce failures with much lower runtime overhead and a much more compact trace.
An exploratory study of autopilot software bugs in unmanned aerial vehicles
TLDR
This work conducted the first large-scale empirical study on two well-known open-source autopilot software platforms for UAVs, namely PX4 and Ardupilot, to characterize bugs in UAVs, and identified five challenges associated with detecting and fixing such UAV-specific bugs.
RAProducer: efficiently diagnose and reproduce data race bugs for binaries via trace analysis
TLDR
This paper proposes a general solution, RAProducer, to efficiently diagnose and reproduce data race bugs for both user-land binary programs and kernels without source code; it enabled the diagnosis of 2 additional real-world bugs that had been left unconfirmed for a long time.
WATCHER: in-situ failure diagnosis
TLDR
A novel diagnosis system is presented that can pinpoint root causes of program failures within the failing process ("in-situ"), eliminating the privacy concern; two optimizations are also proposed to reduce the diagnosis time and to diagnose failures with control flow hijacks.