• Corpus ID: 993044

Early Detection of Configuration Errors to Reduce Failure Damage

@inproceedings{Xu2017EarlyDO,
  title={Early Detection of Configuration Errors to Reduce Failure Damage},
  author={Tianyin Xu and Xinxin Jin and Peng Huang and Yuanyuan Zhou and Shan Lu and Long Jin and Shankar Pasupathy},
  booktitle={USENIX Annual Technical Conference},
  year={2017}
}
Early detection is the key to minimizing failure damage induced by configuration errors, especially those errors in configurations that control failure handling and fault tolerance. Since such configurations are not needed for initialization, many systems do not check their settings early (e.g., at startup time). Consequently, the errors become latent until their manifestations cause severe damage, such as breaking the failure handling. Such latent errors are likely to escape from sysadmins… 
Testing Configuration Changes in Context to Prevent Production Failures
TLDR
The idea behind ctests is simple: connecting production system configurations to software tests so that configuration changes can be tested in the context of the code they affect. The approach effectively detects real-world failure-inducing configuration changes, diverse injected misconfigurations, and misconfigurations in deployed files.
Understanding, Detecting and Localizing Partial Failures in Large System Software
TLDR
OmegaGen, a static analysis tool that automatically generates customized watchdogs for a given program by using a novel program reduction technique, is proposed and successfully applied to six large distributed systems.
Test-case prioritization for configuration testing
TLDR
Inspired by traditional test-case prioritization (TCP), which reorders test executions to speed up detection of regression code faults, this study proposes applying TCP to reorder ctests to speed up detection of misconfigurations.
A Real-Time Detection Method of Software Configuration Errors Based on Fine-Grained Configuration Item Types
TLDR
An automatic real-time detection method of software configuration errors, in which the configuration items are classified based on the fine-grained configuration item types and related syntax patterns, and the configuration constraint rule base is generated.
Early Detection of Configuration Errors by Learning Configuration Rules
TLDR
Two rule-learning frameworks proposed in recent research show great promise for helping to prevent misconfigurations by automatically deriving rules from the configurations of other deployed systems and using them to verify the configuration of a system before it is used.
ConfVD: System Reactions Analysis and Evaluation Through Misconfiguration Injection
TLDR
This paper studies eight mature open-source and commercial software packages, summarizes a fine-grained classification of option types, and proposes misconfiguration generation methods for their constraints. It implements a tool named Configuration Vulnerability Detector (ConfVD), which could improve on generic alteration approaches that do not use constraints.
Automated Reasoning and Detection of Specious Configuration in Large Systems with Symbolic Execution
TLDR
Violet takes a novel approach that uses selective symbolic execution to systematically reason about the performance effect of configuration parameters, their combination effect, and the relationship with input and outputs a performance impact model for the automatic detection of poor configuration settings.
CADET: Debugging and Fixing Misconfigurations using Counterfactual Reasoning
TLDR
The experimental results indicate that CADET can find effective repairs for faults in multiple non-functional properties with (at most) 13% more accuracy, 32% higher gain, and 13× speed-up than other ML-based performance debugging methods.
Statically Verifying Continuous Integration Configurations
TLDR
This work presents VeriCI, the first approach for statically checking for errors in a given CI configuration before the developer pushes a commit to build on the CI server, along with the Misclassification Guided Abstraction Refinement loop, which automates part of the learning process across the heterogeneous build environments in CI.
Check before You Change: Preventing Correlated Failures in Service Updates
TLDR
This paper presents CloudCanary, a system that can perform real-time audits on service updates to identify the root causes of correlated failure risks, and generate improvement plans with increased reliability with two primitives, SNAPAUDIT and DEPBOOSTER.

References

Systems Approaches to Tackling Configuration Errors
TLDR
A holistic and structured overview of the systems approaches that tackle configuration errors and the potential solutions for resolving configuration errors in the spectrum of system development and management is provided.
Automated diagnosis of software configuration errors
TLDR
This work presents a technique (and its tool implementation, called ConfDiagnoser) to identify the root cause of a configuration error - a single configuration option that can be changed to produce desired behavior.
Context-based Online Configuration-Error Detection
TLDR
CODE is a tool that automatically detects software configuration errors based on identifying invariant configuration access rules that predict what access events follow what contexts and can sift through a voluminous number of events and detect deviant program executions.
Do not blame users for misconfigurations
TLDR
A tool to automatically infer configuration requirements from software source code and use the inferred constraints to expose misconfiguration vulnerabilities and detect certain types of error-prone configuration design and handling, which has influenced the Squid Web proxy project to improve its configuration parsing library towards a more user-friendly design.
Configuration Debugging as Search: Finding the Needle in the Haystack
TLDR
The Chronus tool is presented, which automates the task of searching for a failure-inducing state change and can diagnose a range of common configuration errors for both client-side and server-side applications, with performance overhead that is not prohibitive.
X-ray: Automating Root-Cause Diagnosis of Performance Anomalies in Production Software
TLDR
X-ray is a tool that implements performance summarization, a technique for automatically diagnosing the root causes of performance problems and shows that X-ray accurately diagnoses 17 performance issues in Apache, lighttpd, Postfix, and PostgreSQL, while adding 2.3% average runtime overhead.
Proactive detection of inadequate diagnostic messages for software configuration errors
TLDR
A technique to detect inadequate diagnostic messages for configuration errors issued by a configurable software system that injects configuration errors into the software under test, monitors the software outcomes under the injected configuration errors, and uses natural language processing to analyze the output diagnostic message caused by each configuration error.
Automatic Root-cause Diagnosis of Performance Anomalies in Production Software
TLDR
X-ray, a tool that performs performance summarization, accurately diagnoses 14 performance issues in the Apache HTTP server, Postfix mail server, and PostgreSQL database, while adding only 1–7% overhead to production systems.
EnCore: exploiting system environment and correlation information for misconfiguration detection
TLDR
A framework and tool called EnCore to automatically detect software misconfigurations, which takes into account two important factors that were previously unexploited: the interaction between the configuration settings and the executing environment, and the rich correlations between configuration entries.
Precomputing possible configuration error diagnoses
  • A. Rabkin, R. Katz
  • Computer Science
    2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011)
  • 2011
TLDR
This work builds a map from each program point to the options that might cause an error at that point, which reduces the number of false positives by nearly a factor of four for Hadoop, at the cost of approximately one minute's work per unique query.