• Corpus ID: 235377032

# Energy-Based Models for Code Generation under Compilability Constraints

@article{Korbak2021EnergyBasedMF,
title={Energy-Based Models for Code Generation under Compilability Constraints},
author={Tomasz Korbak and Hady ElSahar and Marc Dymetman and Germ{\'a}n Kruszewski},
journal={ArXiv},
year={2021},
volume={abs/2106.04985}
}
• Published 9 June 2021
• Computer Science
• ArXiv
Neural language models can be successfully trained on source code, leading to applications such as code completion. However, their versatile autoregressive self-supervision objective overlooks important global sequence-level features that are present in the data such as syntactic correctness or compilability. In this work, we pose the problem of learning to generate compilable code as constraint satisfaction. We define an Energy-Based Model (EBM) representing a pre-trained generative model with…
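The constraint-satisfaction framing in the abstract admits a compact statement. A plausible reading (sketched here from the abstract alone; the symbols $a$ and $b$ are assumed notation, not quoted from this page) is that the EBM multiplies the pre-trained autoregressive model $a(x)$ by a binary compilability filter:

```latex
P(x) \;\propto\; a(x)\, b(x),
\qquad
b(x) \;=\; \mathbb{1}\!\left[\, x \text{ compiles} \,\right]
```

Under this reading, $P$ is the distribution closest to $a$ (in KL) whose support contains only compilable sequences.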
6 Citations

## Figures and Tables from this paper

### Compilable Neural Code Generation with Compiler Feedback

• Computer Science
FINDINGS
• 2022
To improve compilability of the generated programs, this paper proposes COMPCODER, a three-stage pipeline utilizing compiler feedback for compilable code generation, including language model fine-tuning, compilability reinforcement, and compilability discrimination.


### Controlling Conditional Language Models without Catastrophic Forgetting

• Computer Science
• 2021
DPG is extended to conditional tasks by proposing Conditional DPG (CDPG), and results show that fine-tuning using CDPG robustly moves these pretrained models closer towards meeting control objectives and — in contrast with baseline approaches — does not result in catastrophic forgetting.

### Controlling Conditional Language Models with Distributional Policy Gradients

• Computer Science
ArXiv
• 2021
The results show that fine-tuning using CDPG robustly moves these pretrained models closer towards meeting control objectives and — in contrast with baseline approaches — does not result in catastrophic forgetting.

### RL with KL penalties is better viewed as Bayesian inference

• Computer Science
ArXiv
• 2022
This paper analyzes challenges associated with treating a language model as an RL policy and shows how avoiding those challenges requires moving beyond the RL paradigm. It further shows that KL-regularised RL is equivalent to variational inference: approximating a Bayesian posterior that informs how to update a prior LM to conform with evidence provided by the reward function.
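The equivalence claimed in this abstract has a standard one-line form. Writing $\pi_0$ for the prior LM, $r$ for the reward, and $\beta$ for the KL coefficient (conventional notation, assumed here rather than quoted from this page), the KL-regularised objective is maximized by an exponential tilting of the prior:

```latex
\pi^* \;=\; \arg\max_{\pi}\;
\mathbb{E}_{x \sim \pi}\!\left[ r(x) \right]
\;-\; \beta\, \mathrm{KL}\!\left( \pi \,\|\, \pi_0 \right),
\qquad
\pi^*(x) \;\propto\; \pi_0(x)\, \exp\!\left( r(x) / \beta \right)
```

This $\pi^*$ is exactly the Bayesian posterior the abstract refers to, with $\exp(r(x)/\beta)$ playing the role of a likelihood.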

### On Reward Maximization and Distribution Matching for Fine-Tuning Language Models

• Computer Science
• 2021
The intimate connections between the two paradigms are explored, and it is shown that methods such as KL-control developed in the RM paradigm can be construed as belonging to DM, and that while DM differs from RM, it can suffer from similar training difficulties, such as high gradient variance.

### On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting

• Computer Science
ArXiv
• 2022
This paper explores the theoretical connections between the two paradigms and shows that methods such as KL-control developed for RM can also be construed as belonging to DM, and that while DM differs from RM, it can suffer from similar training difficulties, such as high gradient variance.

## References

SHOWING 1-10 OF 42 REFERENCES

### Neural Code Completion

• Computer Science
• 2017
This paper explores the use of neural network techniques to automatically learn code completion from a large corpus of dynamically typed JavaScript code, and shows different neural networks that leverage not only token level information but also structural information, and evaluates their performance on different prediction tasks.

### CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

• Computer Science
NeurIPS Datasets and Benchmarks
• 2021
This paper introduces CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation that includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison.

### Structured Generative Models of Natural Source Code

• Computer Science
ICML
• 2014
A family of generative models for NSC that have three key properties: first, they incorporate both sequential and hierarchical structure, second, they learn a distributed representation of source code elements, and third, they integrate closely with a compiler.

### An Empirical Study on the Usage of BERT Models for Code Completion

• Computer Science
2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
• 2021
A large-scale empirical study aimed at exploring the capabilities of state-of-the-art deep learning (DL) models in supporting code completion at different granularity levels, including single tokens, one or multiple entire statements, up to entire code blocks.

### Code Completion using Neural Attention and Byte Pair Encoding

• Computer Science
ArXiv
• 2020
This paper uses an encoding that is in-between character and word encoding called Byte Pair Encoding (BPE) and uses this on the source code files treating them as natural text without first going through the abstract syntax tree (AST).
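The BPE procedure this abstract refers to is easy to sketch: repeatedly find the most frequent adjacent symbol pair and merge it into a single symbol. The following is a minimal illustrative implementation, not the code used in the cited paper (function names and the single-string training setup are simplifying assumptions):

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe(text, num_merges):
    """Learn `num_merges` merges over the characters of `text`."""
    tokens = list(text)
    for _ in range(num_merges):
        tokens = merge_pair(tokens, most_frequent_pair(tokens))
    return tokens
```

For example, one merge over `"aaabdaaabac"` fuses the most frequent pair `('a', 'a')` wherever it occurs. Real BPE tokenizers learn merges over word frequencies in a corpus rather than a single string, but the merge loop is the same idea.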

### PHOG: Probabilistic Model for Code

• Computer Science
ICML
• 2016
PHOG generalizes probabilistic context free grammars (PCFGs) by allowing conditioning of a production rule beyond the parent non-terminal, thus capturing rich contexts relevant to programs.
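The generalization this abstract describes can be contrasted in a few lines: a PCFG scores a production given only the parent non-terminal, while a PHOG-style model may condition on richer program context. The sketch below is a toy illustration with invented probabilities and an invented context key (previous token); it is not the cited model:

```python
# PCFG: production probabilities depend only on the parent non-terminal.
PCFG = {
    "Expr": {"Expr + Expr": 0.3, "Name": 0.5, "Number": 0.2},
}

# PHOG-style: production probabilities may also depend on context,
# here modeled (hypothetically) as the token preceding the expansion.
CONTEXTUAL = {
    ("Expr", "return"): {"Name": 0.7, "Number": 0.2, "Expr + Expr": 0.1},
    ("Expr", "="): {"Number": 0.6, "Name": 0.3, "Expr + Expr": 0.1},
}

def score(production, parent, context=None):
    """Probability of a production: falls back to the PCFG table
    when no context (or no matching context entry) is available."""
    if context is not None and (parent, context) in CONTEXTUAL:
        return CONTEXTUAL[(parent, context)].get(production, 0.0)
    return PCFG[parent].get(production, 0.0)
```

The point of the contrast: after `return`, the contextual table shifts mass toward `Name`, which a context-free grammar cannot express.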

### HuggingFace's Transformers: State-of-the-art Natural Language Processing

• Computer Science
ArXiv
• 2019
The *Transformers* library is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.

### A Survey of Machine Learning for Big Code and Naturalness

• Computer Science
ACM Comput. Surv.
• 2018
This article presents a taxonomy based on the underlying design principles of each model and uses it to navigate the literature and discuss cross-cutting and application-specific challenges and opportunities.

### An Actor-Critic Algorithm for Sequence Prediction

• Computer Science
ICLR
• 2017
An approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL) that condition the critic network on the ground-truth output, and shows that this method leads to improved performance on both a synthetic task, and for German-English machine translation.

### A statistical semantic language model for source code

• Computer Science
ESEC/FSE 2013
• 2013
SLAMC is introduced, a novel statistical semantic language model for source code that incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their lexemes.