Jigsaw: Large Language Models meet Program Synthesis

@article{Jain2021JigsawLL,
  title={Jigsaw: Large Language Models meet Program Synthesis},
  author={Naman Jain and Skanda Vaidyanath and Arun Shankar Iyer and Nagarajan Natarajan and Suresh Parthasarathy and Sriram K. Rajamani and Rahul Sharma},
  journal={2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)},
  year={2021},
  pages={1219-1231}
}
Large pre-trained language models such as GPT-3 [10], Codex [11], and Google's language model [7] are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and caution. On the optimistic side, such large language models have the potential to improve productivity by providing an automated AI pair programmer for every programmer in the world. On the cautionary side, since these large language models do not…
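
Since the abstract is cut off above, it helps to picture the workflow Jigsaw builds on: the language model produces candidate code, and the candidates are checked (and, in Jigsaw, repaired) against the user's input-output examples. The sketch below is a minimal, illustrative version of that check-and-filter loop; query_model is a hypothetical stand-in for the Codex/GPT-3 call, and candidate snippets are assumed to bind their answer to a variable named result.

```python
# Minimal sketch of a check-and-filter loop in the spirit of Jigsaw (illustrative only).
# `query_model` is a hypothetical stand-in for a call to Codex/GPT-3; candidates are
# assumed to bind their answer to a variable named `result`.
from typing import List, Optional, Tuple

def query_model(nl_description: str, n_candidates: int) -> List[str]:
    """Hypothetical: ask a large language model for candidate Python snippets."""
    raise NotImplementedError

def satisfies_examples(code: str, examples: List[Tuple[dict, object]]) -> bool:
    """Run a candidate on each example input and compare against the expected output."""
    for inputs, expected in examples:
        env = dict(inputs)                 # fresh environment per example
        try:
            exec(code, {}, env)            # candidate should assign to `result`
        except Exception:
            return False
        if env.get("result") != expected:
            return False
    return True

def synthesize(nl_description: str, examples: List[Tuple[dict, object]]) -> Optional[str]:
    """Return the first model candidate consistent with all I/O examples, if any."""
    for candidate in query_model(nl_description, n_candidates=10):
        if satisfies_examples(candidate, examples):
            return candidate
    return None   # Jigsaw additionally tries to repair failing candidates rather than drop them
```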

Interactive Code Generation via Test-Driven User-Intent Formalization

This paper proposes the workflow of test-driven user-intent formalization (TDUIF), which leverages lightweight user feedback to jointly formalize the user intent as tests (a partial specification) and to generate code that meets the formal user intent.
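
As a rough illustration of the filtering step in such a workflow, the user-approved tests act as a partial specification that every candidate implementation must pass. The task, tests, and candidates below are invented for illustration; this is not the TDUIF tool itself.

```python
# Illustrative filtering step: user-approved tests act as a partial specification.
# The task, tests, and candidates below are invented; this is not the TDUIF tool.
from typing import Callable, List

def passes_user_tests(impl: Callable, tests: List[Callable[[Callable], bool]]) -> bool:
    """A candidate survives only if every user-approved test accepts it."""
    return all(test(impl) for test in tests)

# Hypothetical example: the user approved these two tests for an `is_palindrome` task.
tests = [
    lambda f: f("level") is True,
    lambda f: f("hello") is False,
]

candidates = [
    lambda s: s == s[::-1],        # correct implementation
    lambda s: len(s) % 2 == 1,     # plausible-looking but wrong implementation
]

surviving = [c for c in candidates if passes_user_tests(c, tests)]
assert len(surviving) == 1         # only the palindrome check passes both tests
```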

Automated Repair of Programs from Large Language Models

The study revealed that automatically generated code shares common programming mistakes with human-crafted solutions, indicating APR techniques may have the potential to fix auto-generated code, and that, given bug location information provided by a statistical fault localization approach, Codex edit mode is similar to or better than the existing Java repair tools TBar and Recoder in correcting incorrect solutions.

Improving automatically generated code from Codex via Automated Program Repair

This study systematically examines whether automated program repair (APR) techniques can fix the incorrect solutions produced by language models in LeetCode contests, revealing that automatically generated code shares some common programming mistakes with human-crafted solutions and indicating that existing APR tools have the potential to fix auto-generated code.

When Language Model Meets Private Library

This paper investigates how to equip pre-trained language models with the ability to generate code for private libraries, and proposes a novel framework with two modules: the APIRetriever, which retrieves useful APIs, and the APICoder, which generates code using these APIs.
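
A very rough sketch of that two-stage idea is shown below, using naive keyword overlap in place of the learned APIRetriever and a hypothetical generate_code call in place of the language-model-based APICoder; none of these names come from the paper's implementation.

```python
# Sketch of a retrieve-then-generate pipeline for a private library (illustrative only).
# Keyword overlap stands in for a learned retriever; `generate_code` is a hypothetical
# stand-in for prompting a pre-trained code model.
from typing import Dict, List

def retrieve_apis(query: str, api_docs: Dict[str, str], k: int = 3) -> List[str]:
    """Rank private-library APIs by naive keyword overlap with the task description."""
    q_words = set(query.lower().split())
    scored = sorted(
        api_docs.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

def build_prompt(query: str, api_docs: Dict[str, str], apis: List[str]) -> str:
    """Prepend retrieved API documentation so the model can use APIs it never saw in training."""
    doc_block = "\n".join(f"{name}: {api_docs[name]}" for name in apis)
    return f"# Available private APIs:\n{doc_block}\n# Task: {query}\n"

def generate_code(prompt: str) -> str:
    """Hypothetical: call a pre-trained code model with the API-augmented prompt."""
    raise NotImplementedError
```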

Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code

This paper studies to what extent a state-of-the-art, pre-trained language model of code, Codex, can provide such code generation tools from only a few examples, and concludes that few-shot language models are surprisingly effective.

Natural Language to Code Generation in Interactive Data Science Notebooks

PaChiNCo, a 62B code language model for Python computational notebooks, is presented; it outperforms public code LMs, and few-shot prompting strategies with step-by-step decomposition and NL explanations are explored to elicit better code, showing the potential to improve the diversity and explainability of model predictions.

Large Language Models Are Human-Level Prompt Engineers

It is shown that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness, as well as to improve few-shot learning performance by simply prepending them to standard in-context learning prompts.

ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications

ObSynth is introduced, an interactive system leveraging the domain knowledge embedded in large language models (LLMs) to help users design object models from high-level natural language prompts, showing that it often synthesizes objects, fields, and methods users might have otherwise omitted.

CodexDB: Synthesizing Code for Query Processing from Natural Language Instructions using GPT-3 Codex

CodexDB is a framework on top of GPT-3 Codex that decomposes complex SQL queries into a series of simple processing steps, described in natural language, enabling users to customize SQL query processing via natural language instructions.
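
The sketch below shows the general shape of such a decompose-then-generate pipeline; the hard-coded decomposition and the nl_to_code stub are invented for illustration and are not CodexDB's actual prompts or planner.

```python
# Sketch of a decompose-then-generate pipeline in the spirit of CodexDB (illustrative only).
# The fixed decomposition and `nl_to_code` stub are invented, not CodexDB's actual planner.
from typing import List

def decompose_query(sql: str) -> List[str]:
    """Describe a complex query as a series of simple natural-language processing steps."""
    # Hand-written example decomposition, shown here for one hypothetical query.
    return [
        "Load the 'orders' table into a data frame.",
        "Keep only rows whose order total exceeds 100.",
        "Group the remaining rows by customer and count orders per customer.",
        "Sort customers by that count in descending order.",
    ]

def nl_to_code(step: str) -> str:
    """Hypothetical: ask a code model such as Codex for Python implementing one step."""
    raise NotImplementedError

def synthesize_plan(sql: str) -> str:
    """Generate code for each step and concatenate the pieces into one processing script."""
    return "\n".join(nl_to_code(step) for step in decompose_query(sql))
```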

References

Showing 1-10 of 42 references

Synchromesh: Reliable code generation from pre-trained language models

A framework for substantially improving the reliability of pre-trained models for code generation is proposed, with substantial complementary gains from CSD and TST observed in prediction accuracy and in effectively preventing run-time errors.

Evaluating Large Language Models Trained on Code

It is found that repeated sampling from the GPT language model is a surprisingly effective strategy for producing working solutions to difficult prompts, and the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics, are discussed.
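
The repeated-sampling finding amounts to a sample-then-filter loop: draw many completions and keep one that passes the prompt's unit tests. A minimal sketch, with sample_completion as a hypothetical stand-in for the model call:

```python
# Minimal sample-then-filter loop behind repeated-sampling results (illustrative only).
# `sample_completion` is a hypothetical stand-in for drawing one completion from the model.
from typing import Callable, List, Optional

def sample_completion(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical: draw a single code completion from the language model."""
    raise NotImplementedError

def passes_tests(code: str, tests: List[Callable[[str], bool]]) -> bool:
    """A completion counts as a working solution only if every test accepts it."""
    return all(test(code) for test in tests)

def solve(prompt: str, tests: List[Callable[[str], bool]], k: int = 100) -> Optional[str]:
    """Sample up to k completions and return the first one that passes all tests."""
    for _ in range(k):
        candidate = sample_completion(prompt)
        if passes_tests(candidate, tests):
            return candidate
    return None
```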

Code completion with statistical language models

The main idea is to reduce the problem of code completion to a natural-language processing problem of predicting probabilities of sentences, to design a simple and scalable static analysis that extracts sequences of method calls from a large codebase, and to index these sequences into a statistical language model.
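
To make the reduction concrete, a toy version can train a bigram model over extracted method-call sequences and rank completions by how often each call follows the previous one; the corpus below is invented for illustration.

```python
# Toy bigram model over method-call sequences (illustrative; the corpus is invented).
from collections import Counter, defaultdict
from typing import Dict, List

def train_bigrams(call_sequences: List[List[str]]) -> Dict[str, Counter]:
    """Count which call tends to follow which, as a static analysis of a codebase would provide."""
    following: Dict[str, Counter] = defaultdict(Counter)
    for seq in call_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            following[prev][nxt] += 1
    return following

def complete(prev_call: str, model: Dict[str, Counter], k: int = 3) -> List[str]:
    """Rank candidate next calls by how frequently they followed `prev_call` in the corpus."""
    return [call for call, _ in model[prev_call].most_common(k)]

# Invented corpus of extracted API-call sequences.
corpus = [
    ["open", "read", "close"],
    ["open", "readline", "close"],
    ["open", "read", "close"],
]
model = train_bigrams(corpus)
print(complete("open", model))   # ['read', 'readline']
```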

AutoPandas: neural-backed generators for program synthesis

This work introduces neural-backed operators which can be seamlessly integrated into the program generator and uses these operators at non-deterministic decision points, instead of relying on domain-specific heuristics for the efficiency of the search.

Maximal multi-layer specification synthesis

A hybrid model that combines the power of an LSTM-based sequence-to-sequence model with the Apriori algorithm for mining association rules through unsupervised learning is proposed, which reduces multi-layer specification synthesis to a Max-SMT problem.

PyMT5: Multi-mode Translation of Natural Language and Python Code with Transformers

This work introduces PyMT5, the Python method text-to-text transfer transformer, which is trained to translate between all pairs of Python method feature combinations: a single model that can both predict whole methods from natural language documentation strings (docstrings) and summarize code into docstrings of any common style.

Multi-modal program inference: a marriage of pre-trained language models and component-based synthesis

This work presents an approach that combines PTMs with component-based synthesis (CBS): PTMs are used to generate candidate programs from the natural language description of the task, which are then used to guide the CBS procedure to find the program that matches the precise example-based specification.

Accelerating search-based program synthesis using learned probabilistic models

A weighted search algorithm is proposed to efficiently enumerate programs in order of their likelihood, together with a method based on transfer learning that makes it possible to effectively learn a powerful model, called a probabilistic higher-order grammar, from known solutions in a domain.
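
A tiny sketch of likelihood-ordered enumeration: expand partial programs from a priority queue keyed by the negative log-probability of the rules used so far, so complete programs come out most-likely first. The one-nonterminal grammar and its rule probabilities are invented; in the paper's setting they would come from a learned probabilistic grammar.

```python
# Toy best-first enumeration of programs in decreasing likelihood order (illustrative only).
# The grammar and rule probabilities are invented; a learned probabilistic higher-order
# grammar would supply the weights in the paper's setting.
import heapq
import math
from typing import List, Tuple

# Every rule rewrites the single nonterminal "E"; the probabilities sum to 1.
RULES = [("x", 0.5), ("1", 0.3), ("(E + E)", 0.2)]

def enumerate_programs(limit: int = 5) -> List[Tuple[float, str]]:
    """Repeatedly pop the most likely sentential form and expand its leftmost 'E'."""
    heap = [(0.0, "E")]                      # (negative log-probability, sentential form)
    complete: List[Tuple[float, str]] = []
    while heap and len(complete) < limit:
        cost, form = heapq.heappop(heap)
        if "E" not in form:                  # no nonterminals left: a complete program
            complete.append((math.exp(-cost), form))
            continue
        for rhs, p in RULES:
            heapq.heappush(heap, (cost - math.log(p), form.replace("E", rhs, 1)))
    return complete

print(enumerate_programs())   # most likely first, e.g. (0.5, 'x'), (0.3, '1'), ...
```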

Compositional Program Synthesis from Natural Language and Examples

This paper presents a domain-agnostic program synthesis algorithm, demonstrates its application to an expressive string manipulation language, and evaluates the approach on complex tasks from online help forums that are beyond the scope of current state-of-the-art methods.

RobustFill: Neural Program Learning under Noisy I/O

This work directly compares the program synthesis and program induction approaches to automatic program learning on a large-scale, real-world learning task and demonstrates that the strength of each approach is highly dependent on the evaluation metric and end-user application.