Are mutants a valid substitute for real faults in software testing?

@inproceedings{Just2014AreMA,
  title={Are mutants a valid substitute for real faults in software testing?},
  author={Ren{\'e} Just and Darioush Jalali and Laura Inozemtseva and Michael D. Ernst and Reid Holmes and Gordon Fraser},
  booktitle={Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering},
  year={2014}
}
  • René Just, D. Jalali, G. Fraser
  • Published 11 November 2014
  • Computer Science
  • Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
A good test suite is one that detects real faults. Because the set of faults in a program is usually unknowable, this definition is not useful to practitioners who are creating test suites, nor to researchers who are creating and evaluating tools that generate test suites. In place of real faults, testing research often uses mutants, which are artificial faults -- each one a simple syntactic variation -- that are systematically seeded throughout the program under test. Mutation analysis is… 
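As a rough illustration of the idea of mutants described in the abstract, the following minimal sketch seeds three hand-written mutants into a tiny function and computes a mutation score (killed mutants divided by all mutants). Everything here is an assumption for illustration only: the function `original`, the mutants, the test suites, and the helper names `kills` and `mutation_score` are hypothetical and are not taken from the paper or from any mutation tool.

```python
from typing import Callable, List, Tuple

# Hypothetical program under test: returns the larger of two integers.
def original(a: int, b: int) -> int:
    return a if a > b else b

# Hand-written mutants; each is one small syntactic change to `original`.
def mutant_flip(a: int, b: int) -> int:
    # '>' replaced by '<'
    return a if a < b else b

def mutant_ge(a: int, b: int) -> int:
    # '>' replaced by '>='; behaves exactly like `original` (an equivalent mutant)
    return a if a >= b else b

def mutant_const(a: int, b: int) -> int:
    # else-branch 'b' replaced by 'a'
    return a if a > b else a

MUTANTS: List[Callable[[int, int], int]] = [mutant_flip, mutant_ge, mutant_const]

# A test case is (input a, input b, expected output).
TestCase = Tuple[int, int, int]

def kills(tests: List[TestCase], program: Callable[[int, int], int]) -> bool:
    """A test suite kills a program variant if at least one test fails on it."""
    return any(program(a, b) != expected for a, b, expected in tests)

def mutation_score(tests: List[TestCase],
                   mutants: List[Callable[[int, int], int]]) -> float:
    """Fraction of mutants killed by the suite (equivalent mutants not excluded)."""
    killed = sum(1 for m in mutants if kills(tests, m))
    return killed / len(mutants)

# Both suites pass on the original program, but they differ in strength.
weak_suite: List[TestCase] = [(2, 2, 2)]
strong_suite: List[TestCase] = [(2, 2, 2), (1, 3, 3), (3, 1, 3)]

if __name__ == "__main__":
    print("weak suite score:  ", mutation_score(weak_suite, MUTANTS))
    print("strong suite score:", mutation_score(strong_suite, MUTANTS))
```

In this sketch the stronger suite kills two of the three mutants; the third can never be killed because it behaves identically to the original. Whether a higher score of this kind also means better detection of real faults is the question the paper's title asks.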
Using Controlled Numbers of Real Faults and Mutants to Empirically Evaluate Coverage-Based Test Case Prioritization
TLDR
The overall findings are that, in comparison to mutants, real faults are harder for reordered test suites to quickly detect, suggesting that mutants are not a surrogate for real faults.
Mutation Analysis for the Real World: Effectiveness, Efficiency, and Proper Tool Support
TLDR
This talk will address challenges and summarize the recent contributions in the area of mutation analysis with a focus on effectiveness, efficiency, and tool support.
Defects4J: a database of existing faults to enable controlled testing studies for Java programs
TLDR
Defects4J is a database and extensible framework that provides real bugs to enable reproducible studies in software testing research. It also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies.
Using mutants to help developers distinguish and debug (compiler) faults
TLDR
The approach, although devised for compilers, is applicable as a conservative fault localization algorithm for other types of programs and can help triage certain types of crashes found in fuzzing non‐compiler programs more effectively than a state‐of‐the‐art technique.
Which Software Faults Are Tests Not Detecting?
TLDR
The paper aims to suggest to developers specific ways in which their tests need to be improved to increase fault detection, and recommends that developers not rely only on code coverage and mutation score to measure the effectiveness of their tests.
Detecting Trivial Mutant Equivalences via Compiler Optimisations
TLDR
The new results suggest that TCE may be particularly effective, finding almost half of all equivalent mutants in the case of Java.
Predictive Mutation Testing
TLDR
PMT constructs a classification model, based on a series of features related to mutants and tests, and uses the model to predict whether a mutant would be killed or remain alive without executing it, and has high predictability when predicting the execution results of the majority of mutants.
Tailored Mutants Fit Bugs Better
TLDR
A new approach to mutant selection focusing on the location at which to apply mutation operators and the unnaturalness of the mutated code is proposed, and it is demonstrated that selecting the location where a mutation operator is applied decreases the number of generated mutants without affecting the coupling of mutants and real faults.
An Empirical Study to Determine if Mutants Can Effectively Simulate Students' Programming Mistakes to Increase Tutors' Confidence in Autograding
TLDR
The study investigates whether mutants can replicate students' programming mistakes, and finds that generated mutants capture the observed faulty behaviour of students' solutions and, in some cases, assess test adequacy better than code coverage.
Evaluation and improvement of automated software test suites
TLDR
This dissertation proposed, implemented and evaluated a light-weight mutation approach for identifying pseudo-tested methods that is less resource-intensive than currently used mutation testing approaches.

References

SHOWING 1-10 OF 54 REFERENCES
Defects4J: a database of existing faults to enable controlled testing studies for Java programs
TLDR
Defects4J is a database and extensible framework that provides real bugs to enable reproducible studies in software testing research. It also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies.
Is mutation an appropriate tool for testing experiments?
TLDR
It is concluded that, based on the data available thus far, the use of mutation operators is yielding trustworthy results (generated mutants are similar to real faults); mutants appear, however, to be different from hand-seeded faults, which seem to be harder to detect than real faults.
The use of mutation in testing experiments and its sensitivity to external threats
TLDR
The results of controlled experiments conducted in this paper show that mutation when used in testing experiments is highly sensitive to external threats caused by some influential factors including mutation operators, test suite size, and programming languages.
Achieving scalable mutation-based generation of whole test suites
TLDR
The search-based EvoSuite test generation tool integrates two novel optimizations that avoid redundant test executions on mutants by monitoring state infection conditions, and uses whole test suite generation to optimize test suites towards killing the highest number of mutants, rather than selecting individual mutants.
Sufficient mutation operators for measuring test effectiveness
TLDR
This paper addresses the problem of finding a small set of mutation operators which is still sufficient for measuring test effectiveness by defining a statistical analysis procedure that allows it to identify such a set, together with an associated linear model that predicts mutation adequacy with high accuracy.
JCrasher: an automatic robustness tester for Java
TLDR
JCrasher attempts to detect bugs by causing the program under test to ‘crash’, that is, to throw an undeclared runtime exception, to test the behavior of public methods under random data.
Do Redundant Mutants Affect the Effectiveness and Efficiency of Mutation Analysis?
TLDR
This paper convincingly demonstrates that it is possible to improve the effectiveness and efficiency of a mutation analysis system by identifying and removing redundant mutants.
Test generation via Dynamic Symbolic Execution for mutation testing
TLDR
A general test-generation approach, called PexMutator, for mutation testing using Dynamic Symbolic Execution (DSE), a recent effective test-generation technique, which is able to strongly kill more than 80% of all the mutants for the five studied subjects.
Software error analysis: a real case study involving real faults and mutations
TLDR
It was observed that although the studied mutations were simple faults, they can create erroneous behaviors as complex as those identified for the real faults, which lends support to the representativeness of errors due to mutations.
On guiding the augmentation of an automated test suite via mutation analysis
TLDR
An empirical study of the use of mutation analysis on two open source projects indicates that a focused effort on increasing mutation score leads to a corresponding increase in line and branch coverage to the point that line coverage, branch coverage and mutation score reach a maximum but leave some types of code structures uncovered.