Corpus ID: 211252650

# Learning to Represent Programs with Property Signatures

@article{Odena2020LearningTR,
  title={Learning to Represent Programs with Property Signatures},
  author={Augustus Odena and Charles Sutton},
  journal={ArXiv},
  year={2020},
  volume={abs/2002.09030}
}
• Published 13 February 2020
• Computer Science
• ArXiv
We introduce the notion of property signatures, a representation for programs and program specifications meant for consumption by machine learning algorithms. Given a function with input type $\tau_{in}$ and output type $\tau_{out}$, a property is a function of type: $(\tau_{in}, \tau_{out}) \rightarrow \texttt{Bool}$ that (informally) describes some simple property of the function under consideration. For instance, if $\tau_{in}$ and $\tau_{out}$ are both lists of the same type, one property…
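The abstract's definition can be made concrete with a small sketch. Assuming a function from lists to lists, a property is just a predicate over one input-output pair; the paper summarizes each property across all examples, and the vector of summaries is the signature. The specific property names and helper functions below are illustrative, not taken from the paper's implementation:

```python
# A minimal sketch of property signatures for functions of type
# list -> list. Each property has type (input, output) -> bool.

def prop_same_length(inp, out):
    return len(inp) == len(out)

def prop_output_sorted(inp, out):
    return out == sorted(out)

def prop_same_elements(inp, out):
    return sorted(inp) == sorted(out)

PROPERTIES = [prop_same_length, prop_output_sorted, prop_same_elements]

def property_signature(examples):
    """Summarize each property over all input-output examples.

    A property that holds on every example is "AllTrue", one that holds
    on none is "AllFalse", and anything in between is "Mixed".
    """
    signature = []
    for prop in PROPERTIES:
        results = [prop(i, o) for i, o in examples]
        if all(results):
            signature.append("AllTrue")
        elif not any(results):
            signature.append("AllFalse")
        else:
            signature.append("Mixed")
    return signature

# Input-output examples consistent with a list-reversal function:
examples = [([1, 2, 3], [3, 2, 1]), ([5], [5]), ([2, 1], [1, 2])]
print(property_signature(examples))  # ['AllTrue', 'Mixed', 'AllTrue']
```

The resulting signature (same length always, sortedness sometimes, same multiset of elements always) is the kind of fixed-size feature vector a learned synthesizer can condition on, regardless of how many examples were given.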
#### 16 Citations

Latent Execution for Neural Program Synthesis
• Xinyun Chen, Dawn Song, Yuandong Tian
• Computer Science
• 2021
LaSynth learns the latent representation to approximate the execution of partially generated programs, even if they are incomplete in syntax, and significantly improves the performance of next token prediction over existing approaches, facilitating search.
Context-Aware Parse Trees
CAPT enhances SPT by providing a richer level of semantic representation, and quantitatively demonstrates the value of the proposed semantically-salient features, enabling a specific CAPT configuration to be 39% more accurate than SPT across the 48,610 programs the authors analyzed.
Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages
LaSynth is a model that learns the latent representation to approximate the execution of partially generated programs, even if they are incomplete in syntax, and significantly improves the performance of next token prediction over existing approaches, facilitating search.
MISIM: A Neural Code Semantics Similarity System Using the Context-Aware Semantics Structure
This work presents Machine Inferred Code Similarity (MISIM), a neural code semantics similarity system consisting of two core components: a novel context-aware semantics structure, which was purpose-built to lift semantics from code syntax and an extensible neural code similarity scoring algorithm, which can be used for various neural network architectures with learned parameters.
BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration
• Computer Science, Mathematics
• ICLR
• 2021
A new synthesis approach that leverages learning to guide a bottom-up search over programs, and trains a model to prioritize compositions of intermediate values during search conditioned on a given set of input-output examples.
Sketch-Driven Regular Expression Generation from Natural Language and Examples
• Computer Science
• Transactions of the Association for Computational Linguistics
• 2020
This work presents a framework for regex synthesis in this setting where both natural language (NL) and examples are available, and achieves state-of-the-art performance on the prior datasets and solves 57% of the real-world dataset, which existing neural systems completely fail on.
Program Synthesis with Large Language Models
The limits of the current generation of large language models for program synthesis in general-purpose programming languages are explored, and the semantic grounding of these models is probed by fine-tuning them to predict the results of program execution.
Learning to Combine Per-Example Solutions for Neural Program Synthesis
• Computer Science
• ArXiv
• 2021
Evaluation across programs of different lengths and under two different experimental settings reveal that when given the same time budget, the Cross Aggregator neural network module significantly improves the success rate over PCCoder and other ablation baselines.
MISIM: A Novel Code Similarity System.
This work presents machine Inferred Code Similarity (MISIM), a novel end-to-end code similarity system that consists of two core components: a novel context-aware semantic structure and a neural-based code similarity scoring algorithm that can be implemented with various neural network architectures with learned parameters.
ControlFlag: a self-supervised idiosyncratic pattern detection system for software control structures
• Computer Science
• MAPS@PLDI
• 2021
ControlFlag is presented, a self-supervised MP system that aims to improve debugging by attempting to detect idiosyncratic pattern violations in software control structures and suggests possible corrections in the event an anomalous pattern is detected.

#### References

Showing 1-10 of 46 references
Synthesizing data structure transformations from input-output examples
• Computer Science
• PLDI 2015
• 2015
We present a method for example-guided synthesis of functional programs over recursive data structures. Given a set of input-output examples, our method synthesizes a program in a functional language…
Refinement types for ML
• Computer Science
• PLDI '91
• 1991
A type system called refinement types is described, which is an example of a new way to make this tradeoff, as well as a potentially useful system in itself.
Types and programming languages
This text provides a comprehensive introduction both to type systems in computer science and to the basic theory of programming languages, with a variety of approaches to modeling the features of object-oriented languages.
Automatic Program Synthesis of Long Programs with a Learned Garbage Collector
• Computer Science, Mathematics
• NeurIPS
• 2018
Using this method, the problem of generating automatic code given sample input-output pairs is considered, and programs that are more than twice as long as existing state-of-the-art solutions are created while improving the success rate for comparable lengths, and cutting the run-time by two orders of magnitude.
Program synthesis from polymorphic refinement types
• Computer Science
• PLDI
• 2016
The tool was able to synthesize more complex programs than those reported in prior work, as well as most of the benchmarks tackled by existing synthesizers, often starting from a more concise and intuitive user input.
code2vec: learning distributed representations of code
• Computer Science, Mathematics
• Proc. ACM Program. Lang.
• 2019
A neural model for representing snippets of code as continuous distributed vectors: each snippet is mapped to a single fixed-length code vector, which can be used to predict semantic properties of the snippet, making it the first to successfully predict method names based on a large, cross-project corpus.
Synthesis Through Unification
• Computer Science
• CAV
• 2015
This work presents the synthesis through unification (STUN) approach, which is an extension of the counter-example guided inductive synthesis approach, and picks a program from the program space that is correct for the new set S.
Automating string processing in spreadsheets using input-output examples
This work describes the design of a string programming/expression language that supports restricted forms of regular expressions, conditionals, and loops, along with an algorithm based on several novel concepts for synthesizing a desired program in this language from input-output examples.
Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples
• Computer Science
• ICLR
• 2018
This work proposes Neural Guided Deductive Search (NGDS), a hybrid synthesis technique that combines the best of both symbolic logic techniques and statistical models and produces programs that satisfy the provided specifications by construction and generalize well on unseen examples, similar to data-driven systems.
RobustFill: Neural Program Learning under Noisy I/O
• Computer Science
• ICML
• 2017
This work directly compares both approaches for automatic program learning on a large-scale, real-world learning task and demonstrates that the strength of each approach is highly dependent on the evaluation metric and end-user application.