Learning from examples to improve code completion systems

Marcel Bruch, Martin Monperrus, Mira Mezini
ESEC/FSE '09
The suggestions made by the code completion features of current IDEs are based exclusively on the static type system of the programming language. As a result, proposals are often made that are irrelevant to a particular working context. These suggestions are also ordered alphabetically rather than by their relevance in a particular context. In this paper, we present intelligent code completion systems that learn from existing code repositories. We have implemented three such systems, each using the…
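
To make the contrast between alphabetical and relevance-based ordering concrete, here is a minimal sketch in Python. It is not one of the three systems from the paper, whose details are cut off above. It assumes a simple usage-frequency ranking, and the method names and observed calls are hypothetical.

```python
from collections import Counter

def rank_by_usage(candidates, observed_calls):
    """Order type-valid candidates by how often they occur in example code."""
    counts = Counter(observed_calls)
    # Most frequent first; ties fall back to alphabetical order.
    return sorted(candidates, key=lambda name: (-counts[name], name))

# Hypothetical data: methods the type system allows on some receiver,
# and calls on that receiver type observed in an example repository.
candidates = ["addListener", "getText", "setFont", "setText"]
observed_calls = ["setText", "setText", "getText", "setText",
                  "addListener", "getText"]

ranked = rank_by_usage(candidates, observed_calls)
print(ranked)
```

With these made-up counts, the frequently used `setText` moves to the top of the list, whereas an alphabetical ordering would show `addListener` first.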

Function completion in the time of massive data: A code embedding perspective

This work presents a novel approach for improving current function-call completion tools by learning from independent code repositories, using well-known natural language processing models that can learn vector representations of source code (code embeddings).

Combining Code Embedding with Static Analysis for Function-Call Completion

This work presents a novel approach for improving current function-call completion tools by learning from independent code repositories, using well-known natural language processing models that can learn vector representations of source code (code embeddings).

An Empirical Study on Code Comment Completion

This large-scale study empirically assesses how well a simple n-gram model and the recently proposed Text-To-Text Transfer Transformer (T5) architecture perform in autocompleting a code comment the developer is typing.

Statistical Approach to Increase Source Code Completion Accuracy

An extension of a typical code completion system is presented that is language-agnostic, and thus can be applied to any programming language, and that achieves much more accurate results.

Active code completion

Active code completion is described: an architecture that allows library developers to introduce interactive and highly specialized code generation interfaces, called palettes, directly into the editor. One such system, named Graphite, is designed for the Eclipse Java development environment.

What Should I Code Now?

A plug-in for the Eclipse IDE, named Vertical Code Completion, was developed and applied to widely known open-source systems, showing that the approach could provide suggestions that anticipate what a developer intends to code.

Adaptive Code Completion with Meta-learning

This work trains a base code model that is best able to learn semantic and structural information from context to improve predictions of unseen local tokens and proposes an adaptive code model leveraging meta-learning techniques.

Sequence Model Design for Code Completion in the Modern IDE

A novel design for predicting top-k next tokens is proposed that combines the ability of static analysis to enumerate all valid keywords and in-scope identifiers with the ability of a language model to place a probability distribution over them, and that achieves state-of-the-art accuracy in source code modeling.
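
The combination described in this summary can be sketched roughly as follows. This is an illustrative Python sketch, not the design from that paper. It assumes the static analysis result is a plain list of valid tokens and the language model output is a dictionary of scores; both inputs are hypothetical.

```python
def top_k_completions(valid_tokens, model_scores, k=3):
    """Restrict model scores to statically valid tokens, renormalize, return the top k."""
    # Keep only tokens that static analysis reports as valid at this position.
    scores = {tok: model_scores.get(tok, 0.0) for tok in valid_tokens}
    total = sum(scores.values())
    if total == 0:
        # The model gives no signal: fall back to alphabetical order, uniform weight.
        uniform = 1.0 / len(scores) if scores else 0.0
        return [(tok, uniform) for tok in sorted(scores)][:k]
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(tok, score / total) for tok, score in ranked[:k]]

# Hypothetical inputs: "cout" gets a high model score but is not in scope.
valid_tokens = ["count", "counter", "return"]
model_scores = {"count": 0.5, "cout": 0.3, "return": 0.1, "counter": 0.1}

result = top_k_completions(valid_tokens, model_scores)
print(result)
```

Filtering before renormalizing means the model can never propose a token that the static analysis has ruled out, while the ordering among valid tokens still comes from the model.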

Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models

A fusion ranking scheme is designed that can automatically identify the priority of the completion results and reorder the candidates from multiple code completion models, regardless of the type or the length of their completion results, and a new code completion evaluation metric, Benefit-Cost Ratio, is proposed.

Context-Sensitive Code Completion

This research aims to improve the current state of code completion systems in discovering APIs. The proposed techniques have the potential to help developers learn different aspects of APIs, thus easing software development and improving developer productivity.



Using structural context to recommend source code examples

  • Reid Holmes, G. Murphy
  • Computer Science
    Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005.
  • 2005
An approach for locating relevant code in an example repository that is based on heuristically matching the structure of the code under development to the example code, and the structural context needed to query the repository is extracted automatically from the code.

How Program History Can Improve Code Completion

  • R. Robbes, Michele Lanza
  • Computer Science
    2008 23rd IEEE/ACM International Conference on Automated Software Engineering
  • 2008
A benchmark measuring the accuracy and usefulness of a code completion engine is defined and an alternative interface for completion tools is proposed, which helps improve the results offered by code completion tools.

Parseweb: a programmer assistant for reusing open source code on the web

An approach is developed that takes queries of the form "Source object type → Destination object type" as input and suggests relevant method-invocation sequences that yield the destination object from the source object given in the query.
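
A query of the form "Source object type → Destination object type" can be pictured as a path search over method signatures. The Python sketch below is a simplification for illustration only. It assumes a small hand-written signature table and a breadth-first search, and it does not reproduce how the approach summarized above collects or ranks its candidate sequences.

```python
from collections import deque

def find_invocation_sequence(methods, source, destination):
    """Breadth-first search for a shortest method chain from source type to destination type."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        if current == destination:
            return path
        for name, in_type, out_type in methods:
            # Follow any method that consumes the current type.
            if in_type == current and out_type not in seen:
                seen.add(out_type)
                queue.append((out_type, path + [name]))
    return None  # No chain of known methods connects the two types.

# Illustrative signature table: (method, input type, output type).
methods = [
    ("File.getName", "File", "String"),
    ("File.toPath", "File", "Path"),
    ("Files.newBufferedReader", "Path", "BufferedReader"),
]

sequence = find_invocation_sequence(methods, "File", "BufferedReader")
print(sequence)
```

Breadth-first search returns a shortest chain first, which is one plausible stand-in for "most relevant"; a real system would need a richer ranking.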

Automatic method completion

  • R. Hill, Joe Rideout
  • Computer Science
    Proceedings. 19th International Conference on Automated Software Engineering, 2004.
  • 2004
This work extends the idea of automatic completion to include completion of the body of a method by applying machine learning algorithms to the near-duplicate code segments that frequently exist in large software projects.

XSnippet: mining for sample code

XSnippet, a context-sensitive code assistant framework, is developed. It allows developers to query a sample repository for code snippets that are relevant to the programming task at hand, and it provides better coverage of tasks and better rankings for best-fit snippets than other code assistant systems.

Integrating active information delivery and reuse repository systems

It is argued that this crucial barrier to reuse is overcome by integrating active information delivery, which presents information without explicit queries from the user, and reuse repository systems.

Using task context to improve programmer productivity

This paper presents a mechanism that captures, models, and persists the elements and relations relevant to a task, and reports a statistically significant improvement in the productivity of industry programmers who voluntarily used Mylar for their daily work.

Jungloid mining: helping to navigate the API jungle

Reuse of existing code from class libraries and frameworks is often difficult because APIs are complex and the client code required to use the APIs can be hard to write. We observed that a common

How are Java software developers using the Eclipse IDE?

It is hoped this report provides a start in defining which information to collect and distribute on an ongoing basis to help improve Eclipse and other similar platforms and tools.

FrUiT: IDE support for framework understanding

FrUiT, an Eclipse plug-in that uses data mining techniques to extract reuse patterns from existing framework instantiations, is built, and a first assessment obtained by mining parts of the Eclipse framework is presented.