Corpus ID: 222132891

Deep Just-In-Time Inconsistency Detection Between Comments and Source Code

@inproceedings{Panthaplackel2021DeepJI,
  title={Deep Just-In-Time Inconsistency Detection Between Comments and Source Code},
  author={Sheena Panthaplackel and Junyi Jessy Li and Milo{\vs} Gligori{\'c} and Raymond J. Mooney},
  booktitle={AAAI},
  year={2021}
}
Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is known to lead to confusion and software bugs. In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code, in order to catch potential inconsistencies just-in-time, i.e., before they are… 
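To make the task concrete: the paper frames just-in-time detection as classifying, at code-change time, whether an existing comment still matches the modified code. The sketch below is a deliberately simple heuristic stand-in (not the paper's learned model), using a hypothetical `maybe_inconsistent` helper: it flags a comment that mentions an identifier the edit removed.

```python
import re

def tokens(text):
    """Extract identifier-like tokens from code or comment text."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))

def maybe_inconsistent(comment, old_code, new_code):
    """Toy just-in-time check: flag the comment if it mentions an
    identifier that the code change removed. The paper instead learns
    this signal from paired code changes and comments."""
    removed = tokens(old_code) - tokens(new_code)
    return bool(tokens(comment) & removed)

# The comment still refers to `limit`, which the edit renamed away.
print(maybe_inconsistent(
    "Returns true when count exceeds limit.",
    "def check(count, limit): return count > limit",
    "def check(count, threshold): return count > threshold",
))  # True
```

A learned model generalizes far beyond this lexical overlap, catching semantic drift (e.g., a changed boundary condition) that no token diff would reveal.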
HatCUP: Hybrid Analysis and Attention based Just-In-Time Comment Updating
TLDR
Instead of directly generating new comments, HatCUP proposes a new edit or non-edit mechanism to mimic human editing behavior, by generating a sequence of edit actions and constructing a modified RNN model to integrate newly developed components.
Impact of Evaluation Methodologies on Code Summarization
TLDR
The time-segmented evaluation methodology is introduced, which is novel to the code summarization research community, and compared with the mixed-project and cross-project methodologies that have been commonly used and shows that different methodologies lead to conflicting evaluation results.
Evaluation Methodologies for Code Learning Tasks
TLDR
A novel time-segmented evaluation methodology is formalized, as well as the two methodologies commonly used in the literature, mixed-project and cross-project, and it is argued that the time-segmented methodology is the most realistic.
On the Evaluation of Neural Code Summarization
TLDR
A systematic and in-depth analysis of 5 state-of-the-art neural code summarization models on 6 widely used BLEU variants, 4 pre-processing operations and their combinations, and 3 widely used datasets shows that several factors substantially influence evaluation, in particular the measured performance of the models and the ranking among them.
Program Synthesis with Large Language Models
TLDR
The limits of the current generation of large language models for program synthesis in general purpose programming languages are explored, finding that even the best models are generally unable to predict the output of a program given a specific input.

References

Showing 1-10 of 52 references
Word Embeddings for Comment Coherence
TLDR
This work studies the problem of detecting a lack of coherence between comments and source code by exploiting Word Embeddings (WEs), presents four models based on WEs, and tests these models using six different WE variants in an experiment conducted on a publicly available dataset.
/*icomment: bugs or bad comments?*/
TLDR
iComment automatically extracts 1832 rules from comments with 90.8-100% accuracy and detects 60 comment-code inconsistencies, 33 new bugs and 27 bad comments, in the latest versions of the four programs.
Detecting Code Comment Inconsistency using Siamese Recurrent Network
TLDR
A Siamese recurrent network is proposed that uses word tokens in code and comments, as well as their order in the corresponding code or comment, to detect code-comment inconsistencies.
Automatic detection of outdated comments in open source Java projects
TLDR
DocRevise is presented, a tool that can automatically detect outdated Javadoc comments of open source Java projects at a fine-grained level and can assist developers in locating outdated comments in prior versions of existing projects.
Analyzing the co-evolution of comments and source code
TLDR
An approach to associate comments with source code entities to track their co-evolution over multiple versions is presented, enabling a quantitative assessment of the commenting process in a software system.
Learning to Update Natural Language Comments Based on Code Changes
TLDR
This work proposes an approach that learns to correlate changes across two distinct language representations, to generate a sequence of edits that are applied to the existing comment to reflect the source code modifications.
Automatic Detection of Outdated Comments During Code Changes
TLDR
Experimental results indicate that the proposed machine-learning-based method for detecting comments that should be changed during code changes can help developers discover outdated comments in historical versions of existing projects.
A Large-Scale Empirical Study on Code-Comment Inconsistencies
TLDR
The largest study to date investigating how code and comments co-evolve is presented, performed by mining 1.3 billion AST-level changes from the complete history of 1,500 systems, to define a taxonomy of code-comment inconsistencies fixed by developers.
Detecting fragile comments
TLDR
A new rule-based approach to detect fragile comments, called Fraco, which performed with near-optimal precision and recall on most components of the evaluation data set, and generally outperformed the baseline Eclipse feature.
Do Code and Comments Co-Evolve? On the Relation between Source Code and Comment Changes
TLDR
An approach to map code and comments to observe their co-evolution over multiple versions is described and shows that newly added code - despite its growth rate - barely gets commented.