Evaluating SZZ Implementations Through a Developer-Informed Oracle

@inproceedings{Rosa2021EvaluatingSI,
  title={Evaluating SZZ Implementations Through a Developer-Informed Oracle},
  author={Giovanni Rosa and Luca Pascarella and Simone Scalabrino and Rosalia Tufano and Gabriele Bavota and Michele Lanza and Rocco Oliveto},
  booktitle={2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)},
  year={2021},
  pages={436--447}
}
The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the SZZ accuracy, providing various implementations of SZZ. However, fairly evaluating those implementations on a reliable oracle is an open problem: SZZ evaluations usually rely on (i) the manual analysis of the SZZ output to classify… 

PR-SZZ: How pull requests can support the tracing of defects in software repositories

  • P. Bludau, A. Pretschner
  • Computer Science
    2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
  • 2022
TLDR
An updated version of SZZ utilizing pull requests, which reduces false positives and increases precision by 16 percentage points on average in comparison to existing approaches.

Problems with SZZ and features: An empirical study of the state of practice of defect prediction data collection

TLDR
An empirical analysis of the defect labels created with the SZZ algorithm and the impact of commonly used features on results found that only half of the bug fixing commits determined by SZZ are actually bug fixing.

Reducing the search space of bug inducing commits using failure coverage

TLDR
It is shown that filtering commits using the coverage of the bug revealing test cases can effectively reduce the search space for both bisection and SZZ-like blame models by 87.6% and 27.9%, respectively, significantly reducing the cost of BIC retrieval.

SZZ in the time of Pull Requests

TLDR
This study conducts an in-depth investigation of the reliability and performance of SZZ in the multi-commit model and devises a second dataset that is more extensive and was created directly by developers and Quality Assurance engineers of Mozilla.

Fast Changeset-based Bug Localization with BERT

TLDR
This paper describes how BERT can be made fast enough to be applicable to changeset-based bug localization and explores several design decisions in using BERT for this purpose, including how best to encode changesets and how to match bug reports to individual changes for improved accuracy.

V-SZZ: Automatic Identification of Version Ranges Affected by CVE Vulnerabilities

TLDR
This study proposes an approach based on an improved SZZ algorithm to refine software versions affected by CVE vulnerabilities, which leverages the line mapping algorithms to identify the earliest commit that modified the vulnerable lines, and considers these commits to be the vulnerability-inducing commits.

The Ghost Commit Problem When Identifying Fix-Inducing Changes: An Empirical Study of Apache Projects

TLDR
The results suggest that the next generation of SZZ improvements should be language-aware to connect ghost commits to implicated and defect-fixing commits, and promising directions for mitigation strategies to address each type of ghost commit are discussed.

Are automated static analysis tools worth it? An investigation into relative warning density and external software quality

TLDR
This article investigates the relationship between ASAT warnings emitted by PMD and defects per change and per file, and investigates whether files that induce a defect have more static analysis warnings than the rest of the project.

Regularity or Anomaly? On The Use of Anomaly Detection for Fine-Grained Just-in-Time Defect Prediction

TLDR
An empirical investigation on 32 open-source projects that designs and evaluates three anomaly detection methods for fine-grained just-in-time defect prediction; the results are negative because anomaly detection methods, taken alone, do not surpass the prediction performance of existing machine learning solutions.

References

A Framework for Evaluating the Results of the SZZ Approach for Identifying Bug-Introducing Changes

TLDR
The proposed framework provides a systematic means for evaluating the data that is generated by a given SZZ implementation and finds that current SZZ implementations still lack mechanisms to accurately identify bug-introducing changes.

SZZ unleashed: an open implementation of the SZZ algorithm - featuring example usage in a study of just-in-time bug prediction for the Jenkins project

TLDR
SZZ Unleashed is presented, an open implementation of the SZZ algorithm for git repositories, and an illustrative study on just-in-time bug prediction is concluded.

Revisiting and Improving SZZ Implementations

TLDR
By preprocessing the dataset that is used as input by SZZ, the accuracy of SZZ may be considerably improved; for example, SZZ implementations are approximately 40% more accurate if only valid bug-fix lines are used as the input for SZZ.

SZZ revisited: verifying when changes induce fixes

TLDR
Improvements to the SZZ algorithm are outlined, including replacing annotation graphs with line-number maps that track unique source lines as they change over the lifetime of the software; and DiffJ, a Java syntax-aware diff tool, is used to ignore comments and formatting changes in the source.

The impact of refactoring changes on the SZZ algorithm: An empirical study

TLDR
This paper empirically investigates how refactorings impact both the input (bug-fix changes) and the output (bugs) of the SZZ algorithm and incorporates the refactoring-detection tool in the Refactoring Aware SZZ Implementation (RA-SZZ).

An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

TLDR
An empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects finds that such a model is able to fix thousands of unique buggy methods in the wild.

Automatic Identification of Bug-Introducing Changes

TLDR
This paper presents algorithms to automatically and accurately identify bug-introducing changes and removes false positives and false negatives by using annotation graphs and by ignoring non-semantic source code changes and outlier fixes.

Identifying bug-inducing changes for code additions

TLDR
The original SZZ algorithm is improved by proposing a way to link the code additions in a fixing change to a list of candidate inducing changes, which works well for linking code additions with previous changes, although it still produces many false positives.

Locus: Locating bugs from software changes

TLDR
An IR-based approach, Locus, is proposed to locate bugs using software changes, which offer finer granularity than files and provide important contextual clues for bug-fixing, and it is shown that Locus significantly outperforms existing techniques at source-file-level localization.

How bugs are born: a model to identify how bugs are introduced in software components

TLDR
A model for defining criteria to identify the first snapshot of an evolving software system that exhibits a bug is proposed, based on the perfect test idea, and shows empirical evidence that the prevalent assumption, “a bug was introduced by the lines of code that were modified to fix it”, is just one case of how bugs are introduced in a software system.
...