On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation

@article{Etemadi2020OnTR,
  title={On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation},
  author={Khashayar Etemadi and Martin Monperrus},
  journal={Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops},
  year={2020}
}
  • K. Etemadi, Martin Monperrus
  • Published 27 June 2020
  • Computer Science
  • Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops
Commit messages play an important role in software maintenance and evolution. Nonetheless, developers often do not produce high-quality messages. A number of commit message generation methods have been proposed in recent years to address this problem. Some of these methods are based on neural machine translation (NMT) techniques. Studies show that the nearest neighbor algorithm (NNGen) outperforms existing NMT-based methods, even though NNGen is simpler and faster than NMT. In this paper, we show…
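The nearest-neighbour idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the NNGen implementation: it assumes whitespace tokenisation and bag-of-words cosine similarity, omits any reranking step, and uses a hypothetical toy corpus. All function names are invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count whitespace-separated tokens in a diff (assumed tokenisation)."""
    return Counter(text.split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbour_message(new_diff, training_pairs):
    """Reuse the message of the training diff most similar to new_diff."""
    query = bag_of_words(new_diff)
    _, best_message = max(
        training_pairs,
        key=lambda pair: cosine_similarity(query, bag_of_words(pair[0])),
    )
    return best_message


# Hypothetical toy corpus of (diff, commit message) pairs.
training_pairs = [
    ("- version = 1.0 + version = 1.1", "Bump version to 1.1"),
    ("+ import logging + logging.info ( msg )", "Add logging"),
    ("- README typo + README fixed", "Fix typo in README"),
]

print(nearest_neighbour_message("- version = 1.1 + version = 1.2", training_pairs))
```

In a cross-project setting, `training_pairs` would be drawn from projects other than the one that produced `new_diff`. The sketch makes no claim about how well that works.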
1 Citation

Learning to Describe Solutions for Bug Reports Based on Developer Discussions
TLDR
A corpus for this task is built using a novel technique for obtaining noisy supervision from repository changes linked to bug reports, and the task is found to form an ideal testbed for complex reasoning in long, bimodal dialogue context.

References

Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?
TLDR
A simpler and faster approach is proposed, named NNGen (Nearest Neighbor Generator), to generate concise commit messages using the nearest neighbor algorithm, which is over 2,600 times faster than NMT, and outperforms NMT in terms of BLEU by 21%.
Automatically generating commit messages from diffs using neural machine translation
TLDR
This paper adapts Neural Machine Translation (NMT) to automatically "translate" diffs into commit messages and designs a quality-assurance filter that detects cases in which the algorithm is unable to produce good messages and returns a warning instead.
Generating Commit Messages from Diffs using Pointer-Generator Network
TLDR
PtrGNCMsg, a novel approach based on an improved sequence-to-sequence model with the pointer-generator network to translate code diffs into commit messages, outperforms recent approaches based on neural machine translation and is the first to enable the prediction of out-of-vocabulary (OOV) words.
Commit Message Generation for Source Code Changes
TLDR
This paper first extracts both code structure and code semantics from the source code changes, and then jointly models these two sources of information so as to better learn the representations of the code changes.
ATOM: Commit Message Generation Based on Abstract Syntax Tree and Hybrid Ranking
TLDR
A novel commit message generation model, named ATOM, explicitly incorporates the abstract syntax tree for representing code changes and integrates both retrieved and generated messages through hybrid ranking, and is reported to be effective in generating accurate code commit messages.
On Automatically Generating Commit Messages via Summarization of Source Code Changes
TLDR
An approach, named Change Scribe, is designed to generate commit messages automatically from change sets by taking into account the commit stereotype, the type of changes, and the impact set of the underlying changes.
Code Generation as a Dual Task of Code Summarization
TLDR
This paper proposes a dual training framework to train the two tasks simultaneously, considers the dualities on probability and attention weights, and designs corresponding regularization terms to constrain the duality.
Mining Version Control System for Automatically Generating Commit Comment
TLDR
This work proposes a method to automatically generate commit comments by reusing existing comments in the version control system: it applies syntax, semantic, pre-syntax, and pre-semantic similarities to discover similar commits among half a million commits, and recommends the comments of those similar commits for the input commit.
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
TLDR
Quantitative and qualitative results showed that the proposed model, which automatically describes changes introduced in the source code of a program using natural language, can generate feasible and semantically sound descriptions not only in standard in-project settings, but also in a cross-project setting.
Automatic Generation of Commit Messages using Natural Language Processing
TLDR
This paper presents a Natural Language Processing approach for automatically generating commit messages based on the code changes included in a changeset, simultaneously integrated with a software usage library to read the document files of the software.