To camelcase or under_score

@article{Binkley2009ToCO,
  title={To camelcase or under\_score},
  author={David W. Binkley and Marcia Davis and Dawn J Lawrie and Christopher Morrell},
  journal={2009 IEEE 17th International Conference on Program Comprehension},
  year={2009},
  pages={158--167}
}
Naming conventions are generally adopted in an effort to improve program comprehension. Two of the most popular conventions are alternatives for composing multi-word identifiers: the use of underscores and the use of camel casing. While most programmers have a personal opinion as to which style is better, empirical study forms a more appropriate basis for choosing between them. The central hypothesis considered herein is that identifier style affects the speed and accuracy of manipulating… 

Shorter identifier names take longer to comprehend
TLDR
The results of this study suggest that code is more difficult to comprehend when its identifier names consist only of single letters and abbreviations; these findings may help to save costs and improve software quality.
The impact of identifier style on effort and comprehension
TLDR
A family of studies investigating the impact of program identifier style on human comprehension is presented, finding that experienced software developers appear to be less affected by identifier style; however, beginners benefit from the use of camel casing with respect to accuracy and effort.
Learning natural coding conventions
TLDR
NATURALIZE, a framework that learns the style of a codebase and suggests revisions to improve stylistic consistency, is presented; it builds on recent work in applying statistical natural language processing to source code.
Automatic and Accurate Expansion of Abbreviations in Parameters
TLDR
Evaluation results suggest that the proposed automatic approach, which improves the accuracy of abbreviation expansion by exploiting the specific and fine-grained context, can raise precision from 26 to 95 percent and recall from 26 to 65 percent compared with the state-of-the-art general-purpose approach.
An Eye Tracking Study on camelCase and under_score Identifier Styles
TLDR
An empirical study to determine whether identifier-naming conventions (i.e., camelCase and under_score) affect code comprehension is presented; results indicate no difference in accuracy between the two styles, but subjects recognize identifiers in the underscore style more quickly.
Towards a Model to Appraise and Suggest Identifier Names
  • Anthony S Peruma
  • Computer Science
    2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)
  • 2019
TLDR
This study aims to understand the motivations that drive developers to name and rename identifiers and the decisions they make in determining the name and proposes the development of a linguistic model that determines identifier names based on the behavior of the identifier.
A comprehensive model for code readability
TLDR
The results demonstrate that (1) textual features complement other features and (2) a model containing all the features achieves a significantly higher accuracy as compared with all the other state‐of‐the‐art models.
Women and men — Different but equal: On the impact of identifier style on source code reading
TLDR
It is found that the effort spent on wrong answers is significantly higher for female subjects and that there is an interaction between the effort that female subjects invested in wrong answers and their higher percentages of correct answers when compared to male subjects.
Normalizing source code vocabulary to support program comprehension and software quality
  • Latifa Guerrouj
  • Computer Science
    2013 35th International Conference on Software Engineering (ICSE)
  • 2013
TLDR
The evaluation shows that the context-aware techniques are more accurate and more efficient in terms of computation time than state-of-the-art alternatives, and the findings reveal that feature location techniques can benefit from vocabulary normalization when no dynamic information is available.
Can Identifier Splitting Improve Open-Vocabulary Language Model of Code?
TLDR
This paper proposes to split identifiers both when constructing the vocabulary and when processing model inputs, thus exploiting three different settings of applying identifier splitting to language models for the code completion task. It finds that simply inserting identifier splitting into the pipeline hurts model performance, while a hybrid strategy combining identifier splitting and the BPE algorithm can outperform the original open-vocabulary models on predicting identifiers.
...
...

References

Concise and consistent naming
TLDR
This paper renders adequate identifier naming far more precisely. A formal model, based on bijective mappings between concepts and names, provides a solid foundation for the definition of precise rules for concise and consistent naming.
The Programmer's Lexicon, Volume I: The Verbs
TLDR
By analysing method implementations taken from a corpus of Java applications, an automatically generated, domain-neutral lexicon of verbs, similar to a natural language dictionary, that represents the common usages of many programmers is established.
Restructuring program identifier names
  • B. Caprile, P. Tonella
  • Computer Science
    Proceedings 2000 International Conference on Software Maintenance
  • 2000
TLDR
An approach for the restructuring of program identifier names is proposed, aimed at improving their meaningfulness, which considers two forms of standardization, associated respectively to the lexicon of the composing terms and to the syntax of their arrangement.
Cognitive Perspectives on the Role of Naming in Computer Programs
TLDR
Examination of ways in which human cognition is reflected in the text of computer programs focuses on naming: the assignment of identifying labels to programmatic constructs.
Gotos Considered Harmful and Other Programmers' Taboos
TLDR
A set of common programming taboos is examined, and both social aspects and technical reasons as to why these taboos have arisen are addressed.
Assessing the value of coding standards: An empirical study
TLDR
This paper describes two approaches to quantify the relation between rule violations and actual faults, and presents empirical data on this relation for the MISRA C 2004 standard on an industrial case study.
Eliminating go to's while preserving program structure
TLDR
It can be shown that the reducibility of a program's augmented flow graph, augmenting edges and all, is a necessary and sufficient condition for the eliminability of go to's from that program under the stricter rules.
Reexamining the word length effect in visual word recognition: New evidence from the English Lexicon Project
TLDR
The effect of word length (number of letters in a word) on lexical decision was reexamined using the English Lexicon Project and an unexpected pattern of results taking the form of a U-shaped curve was revealed.
Software technology maturation
TLDR
The conclusion is that technology maturation generally takes much longer than popularly thought, especially for major technology areas; actions that can accelerate the maturation of technology are also considered.
...
...