Corpus ID: 73728621

Program Classification Using Gated Graph Attention Neural Network for Online Programming Service

Mingming Lu, Dingwu Tan, Naixue N. Xiong, Zailiang Chen, Haifeng Li
Online programming services, such as GitHub, TopCoder, and EduCoder, have promoted many social interactions among their users. [...] Key Method: To address this problem, we propose a Graph Neural Network (GNN)-based model, which integrates data-flow and function-call information into the AST and applies an improved GNN model to the integrated graph, so as to achieve state-of-the-art program classification accuracy. The experimental results show that the proposed work can classify programs with…
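The core idea — enriching a syntax tree with data-flow edges before feeding it to a graph model — can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the edge labels, and the simplistic walk-order def-use linking are illustrative assumptions, using only Python's standard `ast` module.

```python
import ast

def build_program_graph(source):
    """Parse source into an AST and collect parent-child (syntax) edges
    plus naive def-use (data-flow) edges between variable occurrences."""
    tree = ast.parse(source)
    ids = {}              # AST node object -> integer node id
    edges = []            # (src_id, dst_id, kind)
    last_def = {}         # variable name -> id of its most recent store

    for node in ast.walk(tree):
        ids.setdefault(id(node), len(ids))

    # Syntax edges: parent -> child for every AST node.
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((ids[id(parent)], ids[id(child)], "ast"))

    # Data-flow edges: link each variable load to its most recent store.
    # (ast.walk is breadth-first, so this only approximates program order.)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                last_def[node.id] = ids[id(node)]
            elif isinstance(node.ctx, ast.Load) and node.id in last_def:
                edges.append((last_def[node.id], ids[id(node)], "dataflow"))
    return edges
```

On a snippet like `x = 1; y = x`, the result contains the usual parent-child AST edges plus one `dataflow` edge from the store of `x` to its later load; a real pipeline would add function-call edges and respect control flow.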
2 Citations
Heterogeneous tree structure classification to label Java programmers according to their expertise level
A new approach to classifying ASTs with traditional supervised-learning algorithms, in which a feature learning process selects the most representative syntax patterns for the child subtrees of different syntax constructs; these patterns enrich the context information of each AST, allowing the classification of compound heterogeneous tree structures.
GRAPHSPY: Fused Program Semantic-Level Embedding via Graph Neural Networks for Dead Store Detection
This work presents a novel hybrid program-embedding approach that derives unnecessary memory operations from the embedding, achieving 90% accuracy while incurring only around half the time overhead of the state-of-the-art tool.

References
Learning to Represent Programs with Graphs
This work proposes to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures, and suggests that these models learn to infer meaningful names and to solve the VarMisuse task in many cases.
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks
This paper achieves over 90% accuracy in the cross-language binary classification task of telling whether two given code snippets implement the same algorithm, and over 80% precision in the algorithm classification task.
Convolutional Neural Networks over Tree Structures for Programming Language Processing
A novel tree-based convolutional neural network (TBCNN) is proposed for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information.
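The distinguishing feature of TBCNN is a convolution window that slides over a node and its children, blending "left" and "right" weight matrices according to each child's position. The following is a minimal sketch of that idea, not the paper's exact kernel; the function name and the linear position coefficients are illustrative assumptions.

```python
import numpy as np

def tree_conv(node_vec, child_vecs, W_top, W_left, W_right, b):
    """One TBCNN-style convolution over a parent node and its children.

    Each child's contribution mixes W_left and W_right by a coefficient
    that moves linearly from the leftmost child (0) to the rightmost (1).
    """
    out = W_top @ node_vec + b
    n = len(child_vecs)
    for i, c in enumerate(child_vecs):
        eta_r = i / (n - 1) if n > 1 else 0.5  # position along siblings
        eta_l = 1.0 - eta_r
        out += eta_l * (W_left @ c) + eta_r * (W_right @ c)
    return np.tanh(out)
```

Because the weights depend only on relative position, the same three matrices handle nodes with any number of children, which is what lets the convolution run over arbitrary ASTs.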
Gated Graph Sequence Neural Networks
This work studies feature learning techniques for graph-structured inputs and achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.
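A gated graph network alternates message passing over the adjacency structure with a GRU-style gated state update at every node. Below is a minimal single-step sketch under simplifying assumptions (one edge type, dense adjacency matrix, plain NumPy); it is not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(h, A, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GGNN propagation step.

    h: (num_nodes, dim) node states; A: (num_nodes, num_nodes) adjacency.
    Aggregates neighbor states into messages, then applies a GRU-style
    gated update so each node can keep or overwrite its previous state.
    """
    m = A @ h                        # messages gathered along edges
    z = sigmoid(m @ Wz + h @ Uz)     # update gate
    r = sigmoid(m @ Wr + h @ Ur)     # reset gate
    h_tilde = np.tanh(m @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_tilde
```

Running this step several times lets information flow along multi-hop paths, which is what allows subgraphs to be matched against larger structures.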
Learning Python Code Suggestion with a Sparse Pointer Network
A neural language model with a sparse pointer network aimed at capturing very long-range dependencies; a qualitative analysis shows the model indeed captures interesting long-range dependencies, such as referring to a class member defined over 60 tokens in the past.
A Convolutional Attention Network for Extreme Summarization of Source Code
An attentional neural network is introduced that applies convolution over the input tokens to detect local, time-invariant and long-range topical attention features in a context-dependent way, addressing the problem of extreme summarization of source-code snippets into short, descriptive, function-name-like summaries.
Learning Embeddings of API Tokens to Facilitate Deep Learning Based Program Processing
This paper proposes a neural model to learn embeddings of API tokens that combines a recurrent neural network with a convolutional neural network and uses API documents as training corpus.
Deep API learning
DeepAPI is proposed, a deep learning based approach to generate API usage sequences for a given natural language query that adapts a neural language model named RNN Encoder-Decoder, and generates an API sequence based on the context vector.
Predicting Program Properties from "Big Code"
This work formulates the problem of inferring program properties as structured prediction and shows how to perform both learning and inference in this context, opening up new possibilities for attacking a wide range of difficult problems in the context of "Big Code", including invariant generation, decompilation, synthesis, and others.
Building Program Vector Representations for Deep Learning
This pioneering paper proposes the "coding criterion" to build program vector representations, which are the premise of deep learning for program analysis, and evaluates the learned vector representations both qualitatively and quantitatively.