On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation

@article{Etemadi2020OnTR,
  title={On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation},
  author={Khashayar Etemadi and Martin Monperrus},
  journal={Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops},
  year={2020}
}
  • K. Etemadi, M. Monperrus
  • Published 2020
  • Computer Science
  • Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
Commit messages play an important role in software maintenance and evolution. Nonetheless, developers often do not produce high-quality messages. A number of commit message generation methods have been proposed in recent years to address this problem. Some of these methods are based on neural machine translation (NMT) techniques. Studies show that the nearest neighbor algorithm (NNGen) outperforms existing NMT-based methods, although NNGen is simpler and faster than NMT. In this paper, we show…
1 Citation

Learning to Describe Solutions for Bug Reports Based on Developer Discussions
TLDR
This work proposes generating a concise natural language description of the solution by synthesizing relevant content within the discussion, which encompasses both natural language and source code, and establishes baselines for generating solution descriptions.

References

Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?
TLDR
A simpler and faster approach, named NNGen (Nearest Neighbor Generator), is proposed to generate concise commit messages using the nearest neighbor algorithm; it is over 2,600 times faster than NMT and outperforms NMT in terms of BLEU by 21%.
Automatically generating commit messages from diffs using neural machine translation
TLDR
This paper adapts Neural Machine Translation (NMT) to automatically "translate" diffs into commit messages and designs a quality-assurance filter that detects cases in which the algorithm is unable to produce good messages and returns a warning instead.
Generating Commit Messages from Diffs using Pointer-Generator Network
TLDR
PtrGNCMsg, a novel approach based on an improved sequence-to-sequence model with a pointer-generator network to translate code diffs into commit messages, outperforms recent approaches based on neural machine translation and is the first to enable the prediction of out-of-vocabulary (OOV) words.
Commit Message Generation for Source Code Changes
TLDR
This paper first extracts both code structure and code semantics from the source code changes, and then jointly models these two sources of information so as to better learn the representations of the code changes.
On Automatically Generating Commit Messages via Summarization of Source Code Changes
TLDR
An approach, coined ChangeScribe, is designed to generate commit messages automatically from change sets by taking into account the commit stereotype, the type of changes, and the impact set of the underlying changes.
Code Generation as a Dual Task of Code Summarization
TLDR
This paper proposes a dual training framework to train the two tasks simultaneously, considers the dualities on probability and attention weights, and designs corresponding regularization terms to constrain the duality.
Mining Version Control System for Automatically Generating Commit Comment
TLDR
This work proposes a method to automatically generate commit comments by reusing existing comments in the version control system: it applies syntax, semantic, pre-syntax, and pre-semantic similarities to discover similar commits among half a million commits, and recommends reusable comments for the input commit from those of the similar commits.
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
TLDR
Quantitative and qualitative results show that the proposed model for automatically describing, in natural language, changes introduced in the source code of a program can generate feasible and semantically sound descriptions not only in standard in-project settings but also in a cross-project setting.
Automatic Generation of Commit Messages using Natural Language Processing
Software development requires a version control system to manage and manipulate the changes made to source code. When a change is made to a file, the related information is updated as a commit message.
On Automatic Summarization of What and Why Information in Source Code Changes
TLDR
An approach is presented that can automatically generate the commit messages related to the code changes, including not only what has been changed but also why it was changed, using method stereotypes and the type of changes to generate commit messages.