Defects4J: a database of existing faults to enable controlled testing studies for Java programs

René Just, Darioush Jalali, Michael D. Ernst
International Symposium on Software Testing and Analysis
Empirical studies in software testing research may not be comparable, reproducible, or characteristic of practice. [...] This framework also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies. Defects4J is publicly available at

Bugs in the wild: examining the effectiveness of static analyzers at finding real-world bugs

This paper presents a preliminary study on the popular static analyzers ErrorProne and SpotBugs, and shows that the analyzers are relatively easy to incorporate into the tool chain of diverse projects that use the Maven build system.

BugsJS: a Benchmark of JavaScript Bugs

BugsJS is proposed, a benchmark of 453 real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs, comprising 444k LOC in total, which facilitates conducting highly-reproducible empirical studies and comparisons of JavaScript analysis and testing tools.

BUGSJS: a benchmark and taxonomy of JavaScript bugs

BugsJS is a benchmark of 453 real, manually validated JavaScript bugs from 10 popular JavaScript server‐side programs, comprising 444k lines of code (LOC) in total, and a classification of the bugs according to their nature is performed, which shows that the taxonomy is adequate for characterizing the bugs in BugsJS.

How effective are mutation testing tools? An empirical analysis of Java mutation testing tools with manual analysis and real faults

There are large differences between the tools’ effectiveness and it is demonstrated that no tool is able to subsume the others and overall, PITRV achieves the best results, by finding 6% more faults than the other tools combined.

Static Automated Program Repair for Heap Properties

This work conducts the largest study of automatically fixing undiscovered bugs in real-world code to date, and presents a new automated program repair technique using Separation Logic that finds and then accurately fixes real bugs without test cases.

Can defects be fixed with weak test suites? An analysis of 50 defects from Defects4J

To understand to what extent defects can be fixed with weak test suites, this work analyzed 50 real world defects from Defects4J, in which it was found that up to 84% of them could be correctly fixed.

Automatic repair of real bugs in java: a large-scale experiment on the defects4j dataset

The result of the experiment shows that the considered state-of-the-art repair methods can generate patches for 47 out of 224 bugs. However, those patches are only test-suite adequate, which means that they pass the test suite and may potentially be incorrect beyond the test-suite satisfaction correctness criterion.

Effective and scalable fault injection using bug reports and generative language models

iBiR is proposed, the first fault injection approach that leverages information from bug reports to inject “realistic” faults, which significantly outperforms conventional mutation testing in terms of injecting faults that semantically resemble and couple with real ones, in the vast majority of the cases.

HyperPUT: Generating Synthetic Faulty Programs to Challenge Bug-Finding Tools

The proposed technique, called HyperPUT, builds C programs from a “seed” bug by incrementally applying program transformations (introducing programming constructs such as conditionals, loops, etc.) until a program of the desired size is generated.

Using Controlled Numbers of Real Faults and Mutants to Empirically Evaluate Coverage-Based Test Case Prioritization

The overall findings are that, in comparison to mutants, real faults are harder for reordered test suites to quickly detect, suggesting that mutants are not a surrogate for real faults.



The major mutation framework: efficient and scalable mutation analysis for Java

Major, a framework for mutation analysis and fault seeding, provides a compiler-integrated mutator and a mutation analyzer for JUnit tests and features its own domain specific language and is designed to be highly configurable to support fundamental research in software engineering.

Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria

An experimental study investigating the effectiveness of two code-based test adequacy criteria for identifying sets of test cases that detect faults found that tests based respectively on control-flow and dataflow criteria are frequently complementary in their effectiveness.

EvoSuite: automatic test suite generation for object-oriented software

EvoSuite is presented, a tool that automatically generates test cases with assertions for classes written in Java code. It applies a novel hybrid approach that generates and optimizes whole test suites towards satisfying a coverage criterion.

Extraction of bug localization benchmarks from history

iBUGS is presented, an approach that semiautomatically extracts benchmarks for bug localization from the history of a project and demonstrates the relevance of the dataset with a case study on the bug localization tool AMPLE.

Using Non-redundant Mutation Operators and Test Suite Prioritization to Achieve Efficient and Scalable Mutation Analysis

This paper investigates the decrease in generated mutants by applying a reduced, yet sufficient, set of mutants for replacing conditional and relational operators and demonstrates that the combination of non-redundant operators and prioritization leveraging information about the runtime and mutation coverage of tests reduces the total cost of mutation analysis further by as much as 65%.

A framework and methodology for studying the causes of software errors in programming systems

Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact

The infrastructure that is being designed and constructed to support controlled experimentation with testing and regression testing techniques is described, along with the impact that this infrastructure has had and can be expected to have.

Are mutants a valid substitute for real faults in software testing?

This paper investigates whether mutants are indeed a valid substitute for real faults, i.e., whether a test suite’s ability to detect mutants is correlated with its ability to detect real faults that developers have fixed, and shows a statistically significant correlation between mutant detection and real fault detection, independently of code coverage.