On Automatically Generating Commit Messages via Summarization of Source Code Changes

@article{CortesCoy2014OnAG,
  title={On Automatically Generating Commit Messages via Summarization of Source Code Changes},
  author={Luis Fernando Cortes-Coy and M. V{\'a}squez and Jairo Aponte and D. Poshyvanyk},
  journal={2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation},
  year={2014},
  pages={275--284}
}
Although version control systems allow developers to describe and explain the rationale behind code changes in commit messages, the state of practice indicates that most of the time such commit messages are either very short or even empty. In fact, in a recent study of 23K+ Java projects it has been found that only 10% of the messages are descriptive and over 66% of those messages contained fewer words as compared to a typical English sentence (i.e., 15-20 words). However, accurate and complete…
On Automatic Summarization of What and Why Information in Source Code Changes
TLDR
An approach is presented that automatically generates commit messages for code changes, describing not only what has been changed but also why, using method stereotypes and the type of changes.
Learning Human-Written Commit Messages to Document Code Changes
TLDR
The results indicate that the messages recommended by ChangeDoc are very good approximations of the ones written by developers and often include important intent information that is not included in the messages generated by other tools.
Automatic Generation of Commit Messages using Natural Language Processing
Software development requires a version control system to manage and manipulate the changes made to source code. When a change is made to a file, related information is recorded as a commit message.
Mining Version Control System for Automatically Generating Commit Comment
TLDR
This work proposes a method to automatically generate commit comments by reusing existing comments in the version control system: it applies syntax, semantic, pre-syntax, and pre-semantic similarities to discover similar commits among half a million commits, and recommends the comments of those similar commits as reusable comments for the input commit.
Commit Message Generation for Source Code Changes
TLDR
This paper first extracts both code structure and code semantics from the source code changes, and then jointly models these two sources of information so as to better learn the representations of the code changes.
Automatically generating commit messages from diffs using neural machine translation
TLDR
This paper adapts Neural Machine Translation (NMT) to automatically "translate" diffs into commit messages and designs a quality-assurance filter that detects cases in which the algorithm is unable to produce good messages and returns a warning instead.
Commit Message Generation from Code Differences using Hidden Markov Models
TLDR
Inspired by Hidden Markov Models (HMMs), the traditional solution to sequence modeling, this work shows that HMMs outperform sequence-to-sequence models without outputting the exact same message as the nearest code diff; experiments show an improvement of 4% over sequence-to-sequence models.
On the Evaluation of Commit Message Generation Models: An Experimental Study
TLDR
This work presents a systematic and in-depth analysis of the state-of-the-art models and datasets for automatic commit message generation, and conducts a human evaluation to find the BLEU metric that best correlates with human scores for the task.
Quality Assurance for Automated Commit Message Generation
  • Bei Wang, Meng Yan, +4 authors Dan Yang
  • Computer Science
  • 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
  • 2021
TLDR
An automated Quality Assurance framework for commit message generation (QAcom), which assures the quality of generated commit messages by automatically filtering out semantically irrelevant generated messages while preserving as many semantically relevant ones as possible.
RCLinker: Automated Linking of Issue Reports and Commits Leveraging Rich Contextual Information
TLDR
This work relies on a recently proposed tool, namely Change Scribe, which generates commit messages containing rich contextual information by using a number of code summarization techniques. It extracts features from these automatically generated commit messages and from bug reports, and inputs them into a classification technique that creates a discriminative model used to predict whether a link exists between a commit message and a bug report.

References

SHOWING 1-10 OF 36 REFERENCES
Automatically documenting program changes
TLDR
An automatic technique for synthesizing succinct human-readable documentation for arbitrary program differences is presented, based on a combination of symbolic execution and a novel approach to code summarization, that is suitable for supplementing or replacing 89% of existing log messages that directly describe a code change.
Automatic documentation generation via source code summarization of method context
TLDR
This paper proposes a technique that includes this context by analyzing how the Java methods are invoked, and finds that programmers benefit from the generated documentation because it includes context information.
Automatic generation of natural language summaries for Java classes
TLDR
This paper presents a technique to automatically generate human-readable summaries for Java classes, assuming no documentation exists, and determines that the summaries are readable and understandable, do not include extraneous information, and, in most cases, are not missing essential information.
On the Use of Automated Text Summarization Techniques for Summarizing Source Code
TLDR
The paper presents a solution which mitigates the two approaches, i.e., short and accurate textual descriptions that illustrate the software entities without having to read the details of the implementation.
Why did this code change?
TLDR
This work proposes the use of multi-document summarization techniques to generate a concise natural language description of why code changed so that a developer can choose the right course of action.
On the nature of commits
  • L. Hattori, M. Lanza
  • Computer Science
  • 2008 23rd IEEE/ACM International Conference on Automated Software Engineering - Workshops
  • 2008
TLDR
This paper defines the size of commits in terms of number of files, and then classifies commits based on the content of their comments, using the history log of nine large open source projects to perform this study.
What's a Typical Commit? A Characterization of Open Source Software Repositories
TLDR
The research examines the version histories of nine open source software systems to uncover trends and characteristics of how developers commit source code to version control systems and finds that approximately 75% of commits are quite small for the systems examined.
Improving change descriptions with change contexts
TLDR
A technique for expressing changes that is fine-grained but preserves some structural aspects is presented, which enables more relevant and concise descriptions in terms of software types and programming activities.
Towards automatically generating summary comments for Java methods
TLDR
A novel technique to automatically generate descriptive summary comments for Java methods is presented: given the signature and body of a method, it identifies the content for the summary and generates natural language text that summarizes the method's overall actions.
Generating natural language summaries for crosscutting source code concerns
TLDR
This work introduces an automated approach that produces a natural language summary describing both what the concern is and how the concern is implemented, and presents the results of an experiment in which programmers were able to perform change tasks more efficiently and more easily with generated concern summaries.