An investigation of misunderstanding code patterns in C open-source software projects

  title={An investigation of misunderstanding code patterns in C open-source software projects},
  author={Fl{\'a}vio M. Medeiros and Gabriel Ferreira Brito Lima and Guilherme Amaral and Sven Apel and Christian K{\"a}stner and M{\'a}rcio Ribeiro and Rohit Gheyi},
  journal={Empirical Software Engineering},
Maintenance consumes 40% to 80% of software development costs, so it is essential to write source code that is easy to understand in order to reduce maintenance costs. Improving code understanding is important because developers often mistake the meaning of code and misjudge program behavior, which can lead to errors. Some patterns in source code, such as operator precedence and the comma operator, have been shown to influence code understanding negatively. Despite initial results…

Recommending Code Understandability Improvements Based on Code Reviews

  • Delano Oliveira
  • Computer Science
    2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)
  • 2021
This paper is an early research proposal to recommend code understandability improvements based on code reviewer knowledge, which comprises a dataset of code understandability improvements extracted from code reviews.

Understanding large-scale software systems - structure and flows

It is not possible to comprehend large systems at the same level as comprehending code, and the interviews demonstrate that system comprehension is largely detached from code and programming language, and includes scope that is not captured in the code.

Evaluating Code Readability and Legibility: An Examination of Human-centric Studies

This paper models program comprehension as a learning activity by adapting a preexisting learning taxonomy, which indicates that some competencies, e.g., tracing, are often exercised in these evaluations, whereas others, e.g., relating similar code snippets, are rarely targeted.

Thinking aloud about confusing code: a qualitative investigation of program comprehension and atoms of confusion

It is argued that thinking of confusion as an atomic construct may pose challenges to formulating new candidates for atoms of confusion, and questioned whether hand-evaluation correctness is, itself, a sufficient instrument to study program comprehension.

Atoms of Confusion: The Eyes Do Not Lie

The present study evaluates whether developers misunderstand the code in the presence of atoms of confusion with an eye tracker, and confirms that atoms hinder developers' performance and comprehension.

Atoms of Confusion in Java

  • C. Langhout, M. Aniche
  • Computer Science
    2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC)
  • 2021
The results show that participants are 2.7 up to 56 times more likely to make mistakes in code snippets affected by 7 out of the 14 studied atoms of confusion, and when faced with both versions of the code snippets, participants perceived the version affected by the atom of confusion to be more confusing and/or less readable in 10 out of the 14 studied atoms.

Evaluating Atoms of Confusion in the Context of Code Reviews

An exploratory case study to provide a deeper understanding of atoms of confusion. The statistical analysis did not show any relationship between atoms of confusion and the presence of confusion comments in code reviews, and the study found evidence that atoms of confusion are mostly not being removed in pull requests.

Thinking Aloud about Confusing Code

Atoms of confusion are small patterns of code that have been empirically validated to be difficult to hand-evaluate by programmers. Previous research focused on defining and quantifying this

Open Source Software Development Challenges

This study reviewed 172 selected studies that used the GitHub dataset as a data source and classified them within the scope of OSS development challenges, based on information extracted from the metadata of the studies.

A Systematic Mapping of Software Engineering Challenges: GHTorrent Case

The 172 studies that use GHTorrent as a data source were categorized within the scope of software engineering challenges and a systematic mapping study was carried out, and the pros and cons of the dataset have been indicated.



Investigating Misunderstanding Code Patterns in C Open-Source Software Projects (Replication Package)

The study shows that, according to developers, only some of the patterns previously considered by researchers may cause misunderstandings, and these results complement previous studies by taking the perception of developers into account.

Understanding misunderstandings in source code

To identify code patterns that may confuse programmers, a preliminary set of 'atoms of confusion' is extracted from known confusing code, and it is shown that these code patterns can lead to a significantly increased rate of misunderstanding versus equivalent code without the patterns.
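As a rough illustration of what "a pattern versus equivalent code without the pattern" can mean, the sketch below pairs two of the patterns named in the abstract at the top of this page (operator precedence and the comma operator) with clearer rewrites. The function names and the specific expressions are hypothetical and are not taken from the study's materials.

```c
#include <assert.h>

/* Operator precedence: the reader must recall that * binds tighter than +. */
int precedence_confusing(int a, int b, int c) {
    return a + b * c;
}

/* Equivalent form: parentheses make the evaluation order explicit. */
int precedence_clear(int a, int b, int c) {
    return a + (b * c);
}

/* Comma operator: the left operand is evaluated for its side effect only,
 * and the value of the whole expression is the right operand. */
int comma_confusing(int x) {
    int y = (x += 1, x * 2);
    return y;
}

/* Equivalent form: one statement per step. */
int comma_clear(int x) {
    x += 1;
    return x * 2;
}

/* Hypothetical helper: checks that each pair behaves identically. */
void check_equivalence(void) {
    assert(precedence_confusing(2, 3, 4) == precedence_clear(2, 3, 4));
    assert(comma_confusing(3) == comma_clear(3));
}
```

Both versions in each pair compute the same value; the difference lies only in how much the reader has to infer.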

An empirical study of goto in C code from GitHub repositories

It is concluded that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice.

Prevalence of Confusing Code in Software Projects: Atoms of Confusion in the Wild

This work uses a corpus of 14 of the most popular and influential open source C and C++ projects to measure the prevalence and significance of small confusing patterns, demonstrating that beyond simple misunderstanding in the lab setting, atoms of confusion are both prevalent - occurring often in real projects, and meaningful - being removed by bug-fix commits at an elevated rate.

Modern code reviews in open-source projects: which problems do they fix?

Surprising results show that the types of changes due to the MCR process in OSS are strikingly similar to those in the industry and academic systems from literature, featuring a similar 75:25 ratio of maintainability-related to functional problems.

Discipline Matters: Refactoring of Preprocessor Directives in the #ifdef Hell

A catalogue of refactorings is proposed, and the number of application possibilities of the refactorings in practice, the opinion of developers about the usefulness of the refactorings, and whether the refactorings preserve behavior are evaluated.

Investigating preprocessor-based syntax errors

A technique based on a variability-aware parser is used to find syntax errors in releases and commits of program families, and the syntax errors are classified into 6 different categories, which may guide developers to avoid them during development.

The Discipline of Preprocessor-Based Annotations - Does #ifdef TAG n't #endif Matter

A mixed-method study, comprising two studies into whether developers care about the discipline of preprocessor-based annotations and whether such annotations really influence maintenance tasks, concludes that undisciplined annotations should not be neglected.

Analyzing the discipline of preprocessor annotations in 30 million lines of C code

By means of an analysis of 40 medium-sized to large-sized C programs, it is shown empirically that programmers use cpp mostly in a disciplined way: about 84% of all annotations respect the underlying source-code structure.
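To make "respecting the underlying source-code structure" concrete, here is a minimal sketch contrasting an annotation that cuts through an expression with one that wraps a whole statement. It assumes the common reading of "disciplined" as aligning with complete syntactic units; the macro `FEATURE_LIMIT` and both function names are hypothetical and do not come from the analyzed programs.

```c
/* Hypothetical feature flag used only for this illustration. */
#define FEATURE_LIMIT 1

/* Undisciplined: the directive wraps only part of the if-condition,
 * so the annotated text is not a complete syntactic unit on its own. */
int clamp_undisciplined(int v) {
    if (v < 0
#ifdef FEATURE_LIMIT
        || v > 100
#endif
    ) {
        return 0;
    }
    return v;
}

/* Disciplined: the directive wraps an entire statement, so the code
 * remains well-formed whether or not the feature is enabled. */
int clamp_disciplined(int v) {
    if (v < 0) {
        return 0;
    }
#ifdef FEATURE_LIMIT
    if (v > 100) {
        return 0;
    }
#endif
    return v;
}
```

Both functions behave the same with the flag defined; the disciplined variant is simply easier to parse, for tools and for readers.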

Code quality analysis in open source software development

It is determined that, up to a certain extent, the average component size of an application is negatively related to user satisfaction with that application.