Deep learning type inference

@article{Hellendoorn2018DeepLT,
  title={Deep learning type inference},
  author={Vincent J. Hellendoorn and Christian Bird and Earl T. Barr and Miltiadis Allamanis},
  journal={Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  year={2018}
}
  • V. Hellendoorn, C. Bird, Earl T. Barr, Miltiadis Allamanis
  • Published 26 October 2018
  • Computer Science
  • Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Dynamically typed languages such as JavaScript and Python are increasingly popular, yet static typing has not been totally eclipsed: Python now supports type annotations, and languages like TypeScript offer a middle ground for JavaScript, namely a strict superset of JavaScript, to which it transpiles, coupled with a type system that permits partially typed programs. [...] We propose DeepTyper, a deep learning model that understands which types naturally occur in certain contexts and relations and can provide…
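To make "partially typed programs" concrete, here is a minimal TypeScript sketch. It is not taken from the paper: the function names `scale` and `parseWidth` are hypothetical, and the explicit `any` stands in for an annotation the programmer has left out. It only illustrates the kind of unannotated slot for which a type-suggestion model would propose a type.

```typescript
// Fully annotated: the compiler checks every argument and the return value.
function scale(value: number, factor: number): number {
  return value * factor;
}

// Partially typed: `raw` is declared `any`, so the compiler does not check
// how it is used. A plausible annotation here would be `string`.
// (Hypothetical example; not from the paper.)
function parseWidth(raw: any): number {
  return scale(parseInt(raw, 10), 2);
}

const width: number = parseWidth("21");
console.log(width); // 42
```

Both functions live in the same program and compile together, which is the sense in which the abstract says the type system permits partially typed code.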

Exploring Type Inference Techniques of Dynamically Typed Languages
TLDR
A new technique is proposed that considers the locally specific code tokens as the context to infer the types of code elements in JavaScript snippets and is 20–47% more accurate than the statically typed language-based techniques and 5–14 times faster than the deep learning techniques without sacrificing accuracy.
Advanced Graph-Based Deep Learning for Probabilistic Type Inference
TLDR
A range of graph neural network (GNN) models is presented that operate on a novel type flow graph (TFG) representation, and it is shown that the two most accurate GNN configurations outperform, in top-1 accuracy, the two most closely related deep learning type inference approaches from past work.
TypeWriter: neural type prediction with search-based validation
TLDR
TypeWriter is presented, the first combination of probabilistic type prediction with search-based refinement of predicted types, which can fully annotate between 14% and 44% of the files in a randomly selected corpus, while ensuring type correctness.
LambdaNet: Probabilistic Type Inference
  • Computer Science
  • 2019
TLDR
This paper proposes a probabilistic type inference scheme for TypeScript based on a graph neural network that can predict both standard types, like number or string, as well as user-defined types that have not been encountered during training.
Large Scale Generation of Labeled Type Data for Python
TLDR
Novel techniques for generating high quality types using 1) information retrieval techniques that work on well documented libraries to extract types and 2) usage patterns by analyzing a large repository of programs are proposed.
Typilus: neural type hints
TLDR
A graph neural network model is presented that predicts types by probabilistically reasoning over a program’s structure, names, and patterns and can employ one-shot learning to predict an open vocabulary of types, including rare and user-defined ones.
NL2Type: Inferring JavaScript Function Types from Natural Language Information
TLDR
NL2Type is presented, a learning-based approach for predicting likely type signatures of JavaScript functions using a recurrent, LSTM-based neural model that, after learning from an annotated code base, predicts function types for unannotated code.
Type4Py: Practical Deep Similarity Learning-Based Type Inference for Python
TLDR
A deep similarity learning-based hierarchical neural network model that learns to discriminate between similar and dissimilar types in a high-dimensional space, which results in clusters of types that can be inferred through nearest-neighbor search.
Sound, heuristic type annotation inference for Ruby
TLDR
InferDL, a novel Ruby type inference system that infers sound and useful type annotations by incorporating heuristics that guess types, is introduced and believed to represent a promising approach for inferring type annotations in dynamic languages.

References

SHOWING 1-10 OF 32 REFERENCES
Static type inference for Ruby
TLDR
Diamondback Ruby (DRuby), a tool that blends Ruby's dynamic type system with a static typing discipline, is described; the authors believe that DRuby takes a major step toward bringing the benefits of combined static and dynamic typing to Ruby and other object-oriented languages.
Python probabilistic type inference with natural language support
TLDR
This work proposes to use probabilistic inference to allow the beliefs of individual type hints to be propagated, aggregated, and eventually converge on probabilities of variable types in Python programs.
JSAI: a static analysis platform for JavaScript
TLDR
JSAI is described, a formally specified, robust abstract interpreter for JavaScript that uses novel abstract domains to compute a reduced product of type inference, pointer analysis, control-flow analysis, string analysis, and integer and boolean constant propagation.
Types for safe locking: Static race detection for Java
TLDR
A static race-detection analysis for multithreaded shared-memory programs, focusing on the Java programming language, based on a type system that captures many common synchronization patterns and two improvements that facilitate checking much larger programs are described.
Refined Criteria for Gradual Typing
TLDR
This paper draws a crisp line in the sand that includes a new formal property, named the gradual guarantee, that relates the behavior of programs that differ only with respect to their type annotations, and argues that the gradual guarantee provides important guidance for designers of gradually typed languages.
Jalangi: a selective record-replay and dynamic analysis framework for JavaScript
TLDR
This paper presents a simple yet powerful framework, called Jalangi, for writing heavy-weight dynamic analyses, which incorporates two key techniques: selective record-replay, which enables recording and faithfully replaying a user-selected part of the program, and shadow values and shadow execution, which enable easy implementation of heavy-weight dynamic analyses.
Recovering clear, natural identifiers from obfuscated JS names
TLDR
This paper describes an approach based on statistical machine translation (SMT) that recovers some of the original names from the JavaScript programs minified by the very popular UglifyJS, and introduces a new tool, JSNaughty, which blends Autonym and JSNice, and significantly outperforms both at identifier name recovery, while remaining just as easy to use as JSNice.
Determinacy in static analysis for jQuery
An empirical study on the impact of static typing on software maintainability
TLDR
An experiment that tests whether static type systems improve the maintainability of software systems, in terms of understanding undocumented code, fixing type errors, and fixing semantic errors, shows rigorous empirical evidence that static types are indeed beneficial to these activities, except when fixing semantic errors.
Statistical Deobfuscation of Android Applications
TLDR
This work phrases the layout deobfuscation problem of Android APKs as structured prediction in a probabilistic graphical model, instantiates this model with a rich set of features and constraints that capture the Android setting, ensuring both semantic equivalence and high prediction accuracy, and shows how to leverage powerful inference and learning algorithms to achieve overall precision and scalability of the probabilistic predictions.