CURE: Code-Aware Neural Machine Translation for Automatic Program Repair

@inproceedings{Jiang2021CURECN,
  title={CURE: Code-Aware Neural Machine Translation for Automatic Program Repair},
  author={Nan Jiang and Thibaud Lutellier and Lin Tan},
  booktitle={2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)},
  year={2021},
  pages={1161-1173}
}
Automatic program repair (APR) is crucial for improving software reliability. Recently, neural machine translation (NMT) techniques have been used to automatically fix software bugs. While promising, these approaches have two major limitations. Their search space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose… 
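
As a rough illustration of what a "code-aware" search strategy can mean in this setting, the sketch below runs a toy beam search over patch tokens and prunes any hypothesis that would introduce an out-of-scope identifier. The function name, token scores, and filtering rule are hypothetical stand-ins chosen for illustration only; this is not CURE's actual implementation.

    # Illustrative sketch only: a toy beam search that filters next-token
    # candidates against identifiers that are actually in scope, mimicking the
    # kind of code-aware search strategy described in the abstract. The scores
    # below stand in for the output of a hypothetical NMT patch-generation model.
    import math
    from typing import Dict, List, Set, Tuple

    def code_aware_beam_search(
        step_scores: List[Dict[str, float]],  # per-step token log-probabilities
        valid_identifiers: Set[str],          # identifiers in scope at the bug location
        keywords: Set[str],                   # language keywords are always allowed
        beam_width: int = 3,
    ) -> List[Tuple[List[str], float]]:
        beams: List[Tuple[List[str], float]] = [([], 0.0)]
        for scores in step_scores:
            candidates = []
            for tokens, logp in beams:
                for tok, tok_logp in scores.items():
                    # Code-aware filter: drop tokens that would introduce an
                    # out-of-scope identifier, so hypotheses stay plausible code.
                    if tok.isidentifier() and tok not in valid_identifiers and tok not in keywords:
                        continue
                    candidates.append((tokens + [tok], logp + tok_logp))
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        return beams

    # Toy usage: only `a` and `count` are in scope, so hypotheses using `b` are
    # pruned even though the model scores `b` higher than `count`.
    steps = [
        {"return": math.log(0.9), "if": math.log(0.1)},
        {"a": math.log(0.5), "b": math.log(0.4), "count": math.log(0.1)},
    ]
    print(code_aware_beam_search(steps, valid_identifiers={"a", "count"}, keywords={"return", "if"}))

The identifier filter here merely stands in for whatever syntactic and scoping checks a real repair system would apply before ranking candidate patches for validation against the test suite.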

Citations

On Multi-Modal Learning of Editing Source Code
TLDR
Modit, a multi-modal NMT-based code editing engine, shows that developers' hints as an input modality can narrow the search space for patches and help it outperform state-of-the-art models at generating correctly patched code in the top-1 position.
Sirius: Static Program Repair with Dependence Graph-Based Systematic Edit Patterns
TLDR
A general-purpose program transformation algorithm for applying PDG-based systematic edit patterns (SEPs) is presented, and a program repair pipeline, Sirius, is built around it that automates mining SEPs, detecting overlooked code locations that require systematic edits, and repairing them by applying the SEPs.
An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions
TLDR
This work systematically investigates the prevalence and conditions that can cause GitHub Copilot to recommend insecure code, and explores Copilot’s performance on three distinct code generation axes—examining how it performs given diversity of weaknesses, diversity of prompts, and diversity of domains.
CIRCLE: continual repair across programming languages
TLDR
CIRCLE (ContInual Repair aCross Programming LanguagEs) is a T5-based APR framework with continual learning across multiple programming languages; it uses a prompting function to narrow the gap between natural language processing (NLP) pre-training tasks and APR.
GLAD: Neural Predicate Synthesis to Repair Omission Faults
TLDR
GLAD is a novel learning-based repair technique that specifically targets if-clause synthesis, which is highly orthogonal to existing techniques and can correctly fix 16 Defects4J v1.2 faults that previous NMT-based techniques could not, while maintaining a reasonable runtime cost.
Neural Program Repair with Execution-based Backpropagation
TLDR
The core novelty of RewardRepair is to improve NMT-based program repair with a loss function based on program compilation and test execution information, rewarding the network to produce patches that compile and that do not overfit.
Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge
TLDR
This work proposes a novel approach called TRANSFER, which leverages deep semantic features and transferred knowledge from open-source data to improve fault localization and program repair; it outperforms all baselines in fault localization and is better than existing deep-learning methods in automated program repair.
DEAR: A Novel Deep Learning-based Approach for Automated Program Repair
  • Yi Li
  • 2022
TLDR
A novel fault localization (FL) technique for multi-hunk, multi-statement fixes that combines traditional spectrum-based (SB) FL with deep learning and data-flow analysis, paired with a two-tier, tree-based LSTM model that incorporates cycle training and uses a divide-and-conquer strategy to learn proper code transformations for fixing multiple statements in the suitable fixing context of surrounding subtrees.
SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics
TLDR
The learning paradigm is changed from supervised training to self-supervised training in an approach called SelfAPR, which generates training samples by perturbing a previous version of the program being repaired and encodes test execution diagnostics into the input representation, steering the neural model toward fixing the specific kind of fault.
...

References

SHOWING 1-10 OF 94 REFERENCES
CoCoNuT: combining context-aware neural translation models using ensemble for program repair
TLDR
CoCoNuT is a new generate-and-validate (G&V) technique that uses ensemble learning over a combination of convolutional neural networks (CNNs) and a new context-aware neural machine translation (NMT) architecture to automatically fix bugs in multiple programming languages.
DLFix: Context-based Code Transformation Learning for Automated Program Repair
TLDR
DLFix is a two-tier DL model that treats APR as learning code transformations from prior bug fixes and the surrounding code contexts of those fixes, without requiring hard-coded bug-fixing patterns as in pattern-based tools.
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
TLDR
This study presents an APR tool, LSRepair, that automatically explores code repositories to search for fix ingredients at method-level granularity using three similar-code search strategies, and argues that code search can drive a faster fix process.
DeepDelta: learning to repair compilation errors
TLDR
A novel approach is proposed that automatically learns patterns with a deep neural network and suggests program repairs for the most costly classes of build-time compilation failures, namely missing symbols and mismatched method signatures.
ARJA: Automated Repair of Java Programs via Multi-Objective Genetic Programming
TLDR
This paper proposes ARJA, a new genetic programming (GP)-based approach for automated repair of Java programs, and presents a novel lower-granularity patch representation that properly decouples the search subspaces of likely-buggy locations, operation types, and potential fix ingredients, enabling GP to explore the search space more effectively.
Leveraging syntax-related code for automated program repair
  • Qi Xin, S. Reiss
  • 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)
  • 2017
TLDR
The results show that ssFix successfully repaired 20 bugs with valid patches generated and that it outperformed five other repair techniques for Java.
Practical program repair via bytecode mutation
TLDR
This paper implements PraPR, the first practical bytecode-level APR technique, presents the first extensive study on fixing real-world bugs via JVM bytecode mutation, and demonstrates for the first time the overfitting problem of recent advanced APR tools.
History Driven Program Repair
TLDR
This work proposes a new technique that utilizes the wealth of bug fixes across projects in their development history to effectively guide and drive the program repair process; it can produce good-quality fixes for many more bugs than the baselines while being reasonably computationally efficient.
DeepFix: Fixing Common C Language Errors by Deep Learning
TLDR
DeepFix is a multi-layered sequence-to-sequence neural network with attention which is trained to predict erroneous program locations along with the required correct statements and could fix 1881 programs completely and 1338 programs partially.
Automatic patch generation learned from human-written patches
TLDR
A novel patch generation approach, PAR (Pattern-based Automatic program Repair), uses fix patterns learned from existing human-written patches to generate program patches automatically; its patches are more acceptable to developers than GenProg's.
...