Online Defect Prediction for Imbalanced Data

@inproceedings{Tan2015OnlineDP,
  title={Online Defect Prediction for Imbalanced Data},
  author={Ming Tan and Lin Tan and Sashank Dara and Caleb Mayeux},
  booktitle={2015 IEEE/ACM 37th IEEE International Conference on Software Engineering},
  year={2015},
  volume={2},
  pages={99-108}
}
Many defect prediction techniques have been proposed to improve software reliability. Change classification predicts defects at the change level, where a change is the set of modifications to one file in a commit. In this paper, we conduct the first study of applying change classification in practice. We identify two issues in the prediction process, both of which contribute to the low prediction performance. First, the data are imbalanced: there are far fewer buggy changes than clean changes. Second…
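
The class imbalance described above is commonly mitigated by resampling the training data before fitting a classifier. The following is a minimal sketch of simple random oversampling of buggy changes using scikit-learn; it is an illustration only, not the authors' implementation, and X_train, y_train, and X_test are hypothetical placeholders for change-level features extracted from commits.

import numpy as np
from sklearn.utils import resample
from sklearn.linear_model import LogisticRegression

def oversample_buggy(X, y, random_state=0):
    """Randomly duplicate buggy changes (label 1) until they match
    the number of clean changes (label 0)."""
    X, y = np.asarray(X), np.asarray(y)
    buggy, clean = X[y == 1], X[y == 0]
    buggy_up = resample(buggy, replace=True,
                        n_samples=len(clean), random_state=random_state)
    X_bal = np.vstack([clean, buggy_up])
    y_bal = np.concatenate([np.zeros(len(clean)), np.ones(len(buggy_up))])
    return X_bal, y_bal

# Hypothetical usage: X_train/y_train are features and labels of past changes;
# X_test holds later changes, ordered by commit time.
# X_bal, y_bal = oversample_buggy(X_train, y_train)
# clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
# predicted_buggy = clf.predict(X_test)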

Extracted Numerical Results

  • Our results show that these techniques improve the precision of change classification by 12.2-89.5% or 6.4-34.8 percentage points (pp.) on the seven projects.
  • We find that the precision is only 18.5%, which is significantly lower than the precisions on open source projects [1], [17].
  • We found that the precision of time sensitive change classification is only 18.5-59.9%, while the precision of cross-validation is 55.5-72.0% for the same data (details in Section VI-A).
  • These techniques have improved the precision of time sensitive change classification by 12.2-89.5% or 6.4-34.8 percentage points (pp.) on the one proprietary project and six open source projects.
  • improvement on precision for Jackrabbit and 15.5 pp.
  • Among them, we select seven developers whose model(s) built by either resampling techniques or updatable classification could achieve 100% precision on the test sets (Table IV) for this case study.
  • We pick the top 10 developers on the list; then we select the developers whose changes allow for at least two runs and the precision of the first run is higher than 60%.
  • The gap in precision is 7.6-37.0 percentage points (pp.) with an average of 18.4 pp.
  • Updatable classification improves the precision of the baseline by 8.4-67.0%, which is 3.8-17.3 pp.
  • For the other three projects, it reduces the precision by 3.4 pp.
  • These results show that resampling increases F1 by 2.2-417.2% over the basic online change classification, which is 0.5-30.5 pp., 13.9 pp. on average, for all seven projects; while updatable classification improves F1 by 21.1-370.2%, which is 4.4-27.0 pp., 11.9 pp. on average.
  • To find more predictable developers, we build prediction models for more developers in the proprietary project and select seven developers on whose changes we can achieve 100% precision (Section IV).
  • We only select developers with 100% precision for the case study.
  • Our evaluation on one proprietary and six open source projects shows that both resampling techniques and updatable classification (illustrated in the sketch below this list) improve the precision by 12.2-89.5% or 6.4-34.8 percentage points.
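
The extracted results above repeatedly refer to updatable classification. The sketch below shows one way a time-sensitive, updatable classifier can be structured: an incremental learner predicts each batch of incoming changes with the model trained on all earlier batches, then folds the batch's labels back in once they are known. This uses scikit-learn's SGDClassifier.partial_fit and is an assumption-laden illustration, not the paper's exact algorithm; the batch structure and the "predict clean before any model exists" fallback are my own choices.

import numpy as np
from sklearn.linear_model import SGDClassifier

def online_change_classification(batches, classes=(0, 1)):
    """batches: time-ordered list of (X, y) pairs of change features and labels.
    Each batch is predicted with the model trained on all earlier batches,
    then its (now known) labels are added to the model via partial_fit."""
    clf = SGDClassifier(random_state=0)  # linear model trained incrementally by SGD
    predictions = []
    for i, (X, y) in enumerate(batches):
        X = np.asarray(X)
        if i == 0:
            # No model has been trained yet: fall back to predicting every change as clean.
            predictions.append(np.zeros(len(X), dtype=int))
        else:
            predictions.append(clf.predict(X))
        clf.partial_fit(X, np.asarray(y), classes=list(classes))
    return predictions

In this sketch each element of batches would correspond to, say, the changes from one time window, labeled only after enough time has passed to know whether each change turned out to be buggy; that delay is what makes the setting "time sensitive" rather than cross-validation.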

Statistics

[Chart: Citations per Year, 2015-2018]

52 Citations

Semantic Scholar estimates that this publication has 52 citations based on the available data.
