Learning natural coding conventions

@inproceedings{Allamanis2014LearningNC,
  title={Learning natural coding conventions},
  author={Miltiadis Allamanis and Earl T. Barr and Charles Sutton},
  booktitle={Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering},
  year={2014}
}
Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability. When collaborating, programmers strive to obey a project’s coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project… 
Investigating naming convention adherence in Java
TLDR
This paper presents NOMINAL, a naming convention checking library for Java that allows the declarative specification of conventions regarding typography and the use of abbreviations and phrases, and investigates the extent to which developers follow conventions.
Do People Prefer "Natural" code?
TLDR
It is found that transformations to Java and Python expressions in a distinct test corpus generally produce program structures that are less common in practice, supporting the theory that the high repetitiveness in code is a matter of deliberate preference.
Deep Generation of Coq Lemma Names Using Elaborated Terms
TLDR
These models, based on multi-input neural networks, are the first to leverage syntactic and semantic information from Coq's lexer, parser, and kernel for naming; the key insight is that learning from elaborated terms can substantially boost model performance.
Style-Analyzer: Fixing Code Style Inconsistencies with Interpretable Unsupervised Algorithms
Source code reviews are manual, time-consuming, and expensive. Human involvement should be focused on analyzing the most relevant aspects of the program, such as logic and maintainability, rather
Naming Practices in Java Projects: An Empirical Study
TLDR
A study of the naming practices of Java programmers that analyzed 1,421,607 identifier names from 40 open-source Java projects and categorized these names into eight naming practices, highlighting the contexts in which identifiers following each naming practice tend to appear more regularly.

References

Concise and Consistent Naming
TLDR
This paper renders adequate identifier naming far more precisely. A formal model, based on bijective mappings between concepts and names, provides a solid foundation for the definition of precise rules for concise and consistent naming.
Emergent, crowd-scale programming practice in the IDE
TLDR
This work built Codex, a knowledge base that records common practice for the Ruby programming language by indexing over three million lines of popular code, and suggests that operationalizing practice-driven knowledge in structured domains such as programming can enable a new class of user interfaces.
To camelcase or under_score
TLDR
An empirical study of 135 programmers and non-programmers was conducted to better understand the impact of identifier style on code readability, and results indicate that camel casing leads to higher accuracy among all subjects regardless of training.
Technical Report: Towards a Universal Code Formatter through Machine Learning
TLDR
A code formatter called CodeBuff is introduced that uses machine learning to abstract formatting rules from a representative corpus, using a carefully designed feature set and is grammar invariant for a given language.
Syntactic Identifier Conciseness and Consistency
TLDR
Using a pool of 48 million lines of code, experiments with the resulting syntactic rules for concise and consistent naming illustrate that violations of the syntactic pattern exist and two case studies show that three quarters of the violations uncovered are "real" and would be identified using a concept mapping.
Learning Python Code Suggestion with a Sparse Pointer Network
TLDR
A neural language model with a sparse pointer network is introduced, aimed at capturing very long-range dependencies; a qualitative analysis shows this model indeed captures interesting long-range dependencies, like referring to a class member defined over 60 tokens in the past.
On the "naturalness" of buggy code
TLDR
It is found that code with bugs tends to be more entropic (i.e. unnatural), becoming less so as bugs are fixed, suggesting that entropy may be a valid, simple way to complement the effectiveness of PMD or FindBugs, and that search-based bug-fixing methods may benefit from using entropy both for fault-localization and searching for fixes.
Linguistic antipatterns: what they are and how developers perceive them
TLDR
This work identifies recurring poor practices related to inconsistencies among the naming, documentation, and implementation of an entity—called Linguistic Antipatterns (LAs)—that may impair program understanding and identifies a subset of LAs which were universally agreed upon as being problematic.
A New Family of Software Anti-patterns: Linguistic Anti-patterns
TLDR
The definition of software linguistic antipatterns is introduced, a first catalogue of one family of them is provided, a detector prototype for Java programs called LAPD (Linguistic Anti-Pattern Detector) is proposed, and a study investigating the presence of linguistic antipatterns in four Java software projects is reported.