Improving automated source code summarization via an eye-tracking study of programmers

@article{Rodeghero2014ImprovingAS,
  title={Improving automated source code summarization via an eye-tracking study of programmers},
  author={Paige Rodeghero and Collin McMillan and Paul W. McBurney and Nigel Bosch and Sidney K. D’Mello},
  journal={Proceedings of the 36th International Conference on Software Engineering},
  year={2014}
}
Source Code Summarization is an emerging technology for automatically generating brief descriptions of code. Current summarization techniques work by selecting a subset of the statements and keywords from the code, and then including information from those statements and keywords in the summary. The quality of the summary depends heavily on the process of selecting the subset: a high-quality selection would contain the same statements and keywords that a programmer would choose. Unfortunately… 

Automatic Documentation Generation via Source Code Summarization
  • P. McBurney
  • Computer Science
    2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
  • 2015
TLDR
This paper proposes three specific research objectives for improving automatic documentation generation, including studying the similarity between source code and summary, and studying whether including contextual information about source code improves summary quality.
A Code Summarization Approach for Object Oriented Programs
  • A. H. Mohsin, M. Hammad
  • Computer Science
    2019 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT)
  • 2019
TLDR
A code summarization framework is proposed to document source code by mapping the target source code segments to an XML representation. The generated results showed that the proposed approach is useful in understanding the different structural aspects of the source code.
Autofolding for Source Code Summarization
TLDR
The autofolding problem, which is to automatically create a code summary by folding less informative code regions, is introduced by formulating the problem as a sequence of AST folding decisions, leveraging a scoped topic model for code tokens.
Automatic Code Summarization: A Systematic Literature Review
TLDR
A systematic literature review over the automatic source code summarization field provides an overview of the state of the art, and sheds light on future research directions of program comprehension and comment generation.
A Survey on Research of Code Comment
TLDR
This paper compiles the relevant research on code comments so far, covering four main aspects: automatic generation of code comments, consistency of code comments, classification of code comments, and quality evaluation of code comments.
A Human Study of Comprehension and Code Summarization
TLDR
A human study involving both university students and professional developers found that participants performed significantly better using human-written summaries versus machine-generated summaries, but found no evidence to support that participants perceive human- and machine-generated summaries to have different qualities.
Selection and presentation practices for code example summarization
TLDR
A study to discover how and why code can be summarized, yielding a list of practices that participants followed to summarize code examples and empirically supported hypotheses justifying the use of specific practices.
Detecting Important Terms in Source Code for Program Comprehension
TLDR
A unified prediction model based on a Naive Bayes algorithm is proposed that predicts the top quartile of the most-important terms with approximately 50% precision and recall, outperforming other popular techniques.
Toward Automatic Summarization of Arbitrary Java Statements for Novice Programmers
  • Mohammed Hassan, Emily Hill
  • Computer Science
    2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)
  • 2018
TLDR
A novel technique towards automatically generating comments for Java statements suitable for novice programmers that goes beyond existing approaches to method summarization to meet the needs of novices and also leverages API documentation when available is presented.

References

SHOWING 1-10 OF 66 REFERENCES
On the Use of Automated Text Summarization Techniques for Summarizing Source Code
TLDR
The paper presents a solution which mitigates the two approaches, i.e., short and accurate textual descriptions that illustrate the software entities without having to read the details of the implementation.
Automatic generation of natural language summaries for Java classes
TLDR
This paper presents a technique to automatically generate human readable summaries for Java classes, assuming no documentation exists, and determines that they are readable and understandable, they do not include extraneous information, and, in most cases, they are not missing essential information.
Towards automatically generating summary comments for Java methods
TLDR
A novel technique to automatically generate descriptive summary comments for Java methods is presented, given the signature and body of a method, which identifies the content for the summary and generates natural language text that summarizes the method's overall actions.
Evaluating source code summarization techniques: Replication and expansion
TLDR
A new topic modeling based approach to source code summarization is proposed, and via a study of 14 developers, source code summaries generated using the proposed technique are evaluated.
Automatically documenting program changes
TLDR
An automatic technique for synthesizing succinct human-readable documentation for arbitrary program differences is presented, based on a combination of symbolic execution and a novel approach to code summarization, that is suitable for supplementing or replacing 89% of existing log messages that directly describe a code change.
Automatically detecting and describing high level actions within methods
TLDR
This work presents an automatic technique for identifying code fragments that implement high-level abstractions of actions and expressing them as a natural language description. Judgements of the generated descriptions by 15 experienced Java programmers strongly suggest that they view the fragments the authors identify as representing high-level actions, and that the synthesized descriptions accurately express the abstraction.
Quality analysis of source code comments
TLDR
A first detailed approach for quality analysis and assessment of code comments is presented, which provides a model for comment quality which is based on different comment categories and is used on Java and C/C++ programs.
Generating Parameter Comments and Integrating with Method Summaries
TLDR
A novel technique to automatically generate descriptive comments for parameters of Java methods that provide a high-level overview of the role of a parameter in a method is described.
Searching and skimming: An exploratory study
TLDR
A formative study in which programmers were asked to perform corrective tasks to a system they were initially unfamiliar with, focused specifically on how programmers decide what to search for, and how they decide which results are relevant to their task.
Code fragment summarization
TLDR
This machine-learning-based algorithm could approximate the summaries in an oracle manually generated by humans with a precision of 0.71, a promising result because summaries with this level of precision achieved the same level of agreement as human annotators had with each other.