How Well Do Change Sequences Predict Defects? Sequence Learning from Software Changes

Authors: Ming Wen, Rongxin Wu, S. C. Cheung
Journal: IEEE Transactions on Software Engineering

Software defect prediction, which aims to identify defective modules, can assist developers in finding bugs and prioritizing limited quality assurance resources. Various features to build defect prediction models have been proposed and evaluated. Among them, process metrics are one important category. Yet, existing process metrics are mainly encoded manually from change histories and ignore the sequential information arising from the changes during software evolution. Are the change sequences… 
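The contrast drawn above, between manually encoded process metrics and the sequential information in a change history, can be illustrated with a small sketch. This is not the paper's method (the abstract is truncated here); the record fields, file names, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical change records: (file, commit_order, developer, lines_changed).
changes = [
    ("a.py", 1, "alice", 10),
    ("a.py", 2, "bob", 3),
    ("a.py", 3, "alice", 25),
    ("b.py", 1, "carol", 7),
]


def aggregate_metrics(changes):
    """Manually encoded process metrics: totals per file, order discarded."""
    acc = defaultdict(lambda: {"n_changes": 0, "churn": 0, "devs": set()})
    for path, _order, dev, churn in changes:
        acc[path]["n_changes"] += 1
        acc[path]["churn"] += churn
        acc[path]["devs"].add(dev)
    return {
        path: {
            "n_changes": m["n_changes"],
            "churn": m["churn"],
            "n_devs": len(m["devs"]),
        }
        for path, m in acc.items()
    }


def change_sequences(changes):
    """Sequence view: per file, the ordered list of (developer, churn) steps."""
    seqs = defaultdict(list)
    for path, _order, dev, churn in sorted(changes, key=lambda c: (c[0], c[1])):
        seqs[path].append((dev, churn))
    return dict(seqs)


metrics = aggregate_metrics(changes)
sequences = change_sequences(changes)
```

Two files with identical totals can have different sequences (for example, one large change followed by small fixes versus the reverse), which is the kind of information an aggregate count cannot express.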

On the Use of Deep Learning in Software Defect Prediction

Develop more comprehensive DL approaches that automatically capture the needed features; use diverse software artifacts other than source code; adopt data augmentation techniques to tackle the class imbalance problem; publish replication packages.

A Systematic Literature Review of Software Defect Prediction Using Deep Learning

A Systematic Literature Review (SLR) of software defect prediction using deep learning models, focused on identifying studies that use the semantics of source code to improve defect prediction.

Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit

DTrans is designed with dynamically relative position encoding in the multi-head attention of the Transformer; it generates patches more accurately than the state-of-the-art methods and locates the lines to change with higher accuracy than existing methods.

A Code Naturalness Based Defect Prediction Method at Slice Level

The experimental results show that the CNDePor method has significant advantages over traditional defect prediction methods and over a method based on code naturalness, and that it has comparable performance to, and stronger interpretability than, a state-of-the-art method based on deep learning.

An Empirical Examination of the Impact of Bias on Just-in-time Defect Prediction

The results highlight that datasets imbalanced in terms of commit characteristics can significantly impact prediction performance, and that few-shot learning based techniques can help alleviate the situation.

Code Edit Recommendation Using a Recurrent Neural Network

A code edit recommendation method using a recurrent neural network (CERNN) that forms contexts that maintain the sequence of developers’ interactions to recommend files to edit and stops recommendations when the first recommendation becomes incorrect for the given evolution task.

On the Reproducibility and Replicability of Deep Learning in Software Engineering

It is urgent for the SE community to provide a long-lasting link to a high-quality reproduction package, enhance DL-based solution stability and convergence, and avoid performance sensitivity on different sampled data.

Deep Learning In Software Engineering

This dissertation exploits the tool of deep learning to automatically learn patterns discovered within previous software data and automatically apply those patterns to present day software development, to exemplify that the techniques presented provide a meaningful advancement to the field of software engineering and the automation of software development tasks.

On the Replicability and Reproducibility of Deep Learning in Software Engineering

It is urgent for the SE community to provide a long-lasting link to a high-quality reproduction package, enhance DL-based solution stability and convergence, and avoid performance sensitivity on different sampled data.

A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research

A systematic literature review of research at the intersection of SE & DL, from its modern inception to the present, that delineates the foundations of DL techniques applied to SE research and highlights likely areas of fertile exploration for the future.



Automatically Learning Semantic Features for Defect Prediction

This paper proposes to leverage a powerful representation-learning algorithm, deep learning, to learn semantic representation of programs automatically from source code, using Deep Belief Network to automatically learn semantic features from token vectors extracted from programs' Abstract Syntax Trees.

Deep Learning for Just-in-Time Defect Prediction

An approach, Deeper, is proposed that uses deep learning to predict defect-prone changes: it applies a deep belief network algorithm, and a machine learning classifier is then built on the selected features.

Defect prediction from static code features: current results, limitations, new approaches

It is hypothesized that the limits of the standard learning goal of maximizing area under the curve (AUC) of the probability of false alarms and probability of detection “AUC(pd, pf)” are reached, and certain widely used learners perform much worse than simple manual methods.
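For readers unfamiliar with the notation, pd (probability of detection) is the true positive rate and pf (probability of false alarm) is the false positive rate; AUC(pd, pf) is the area under the curve traced by (pf, pd) as the decision threshold varies. A minimal sketch, assuming binary labels with 1 meaning defective; the function names and sample data are hypothetical.

```python
def pd_pf(y_true, y_pred):
    """Return (pd, pf) for binary labels, where 1 means defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # recall on defective modules
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms on clean modules
    return pd, pf


def auc_pd_pf(y_true, scores):
    """Trapezoidal area under the (pf, pd) curve over all score thresholds."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        pd, pf = pd_pf(y_true, y_pred)
        points.append((pf, pd))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


example_pd, example_pf = pd_pf([1, 1, 0, 0], [1, 0, 1, 0])
example_auc = auc_pd_pf([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

A learner that ranks every defective module above every clean one reaches an area of 1.0, while random ranking is expected to give about 0.5.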

Predicting defect densities in source code files with decision tree learners

This work focuses on defect density prediction and presents an approach that applies a decision tree learner on evolution data extracted from the Mozilla open source web browser project, which includes different source code, modification, and defect measures computed from seven recent Mozilla releases.

Comparing fine-grained source code changes and code churn for bug prediction

This paper presents a series of experiments using different machine learning algorithms with a dataset from the Eclipse platform to empirically evaluate the performance of SCC and LM and shows that SCC outperforms LM for learning bug prediction models.

How, and why, process metrics are better

It is found that code metrics have high stasis; this leads to stagnation in the prediction models, leading to the same files being repeatedly predicted as defective; unfortunately, these recurringly defective files turn out to be comparatively less defect-dense.

Dictionary learning based software defect prediction

A cost-sensitive discriminative dictionary learning (CDDL) approach for software defect classification and prediction, which outperforms several representative state-of-the-art defect prediction methods.

Transfer defect learning

A state-of-the-art transfer learning approach is applied to make feature distributions in source and target projects similar, and a novel transfer defect learning approach, TCA+, is proposed, by extending TCA.

Sample-based software defect prediction with active and semi-supervised learning

This paper proposes a novel active semi-supervised learning method ACoForest which is able to sample the modules that are most helpful for learning a good prediction model and shows that the proposed methods are effective and have potential to be applied to industrial practice.

Online Defect Prediction for Imbalanced Data

This first study of applying change classification in practice identifies two issues in the prediction process, both of which contribute to low prediction performance, and applies and adapts online change classification, resampling, and updatable classification techniques to improve classification performance.
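As a generic illustration of resampling for imbalanced defect data, the sketch below performs simple random oversampling of the smaller classes. The summary does not say which resampling technique the study uses, so this is an assumption for illustration only; the function name and data are hypothetical.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    out_rows, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical data: four clean changes (0) and one defective change (1).
rows = [[1], [2], [3], [4], [5]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

Oversampling should be applied to training data only; duplicating rows before a train/test split would leak copies of the same change into the test set.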