• Publications
Deep Code Comment Generation
TLDR
DeepCom applies Natural Language Processing (NLP) techniques to learn from a large code corpus and generates comments for Java methods from the learned features.
Summarizing Source Code with Transferred API Knowledge
TLDR
Experiments on large-scale real-world industry Java projects indicate that the proposed novel approach, named TL-CodeSum, is effective and outperforms the state-of-the-art in code summarization.
Deep code comment generation with hybrid lexical and syntactical information
TLDR
Experimental results demonstrate that Hybrid-DeepCom outperforms the state-of-the-art by a substantial margin, and that reducing out-of-vocabulary tokens effectively improves accuracy.
What Security Questions Do Developers Ask? A Large-Scale Study of Stack Overflow Posts
TLDR
A large-scale study of security-related questions on Stack Overflow that groups the topics into five main categories and investigates the popularity and difficulty of each.
HYDRA: Massively Compositional Model for Cross-Project Defect Prediction
TLDR
HYDRA's improvements over the baseline approaches, both in F1-score and when inspecting the top 20 percent of lines of code, are substantial; in most cases the improvements are statistically significant with large effect sizes across the 29 datasets.
Deep Learning for Just-in-Time Defect Prediction
TLDR
An approach named Deeper is proposed that predicts defect-prone changes by using a deep belief network to select features and then building a machine learning classifier on the selected features.
Tag recommendation in software information sites
TLDR
This paper proposes TagCombine, an automatic tag recommendation method that analyzes the terms in objects posted on software information sites and recommends appropriate tags for them.
Practitioners' expectations on automated fault localization
TLDR
An empirical study surveys practitioners from more than 30 countries across 5 continents about their expectations of fault localization research and investigates factors that affect practitioners' willingness to adopt a fault localization technique.
Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?
TLDR
A simpler and faster approach named NNGen (Nearest Neighbor Generator) is proposed to generate concise commit messages using the nearest neighbor algorithm; NNGen is over 2,600 times faster than NMT and outperforms it in BLEU by 21%.
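The nearest-neighbor idea behind NNGen can be illustrated with a short sketch: represent diffs as bag-of-words vectors, find the training diff most similar to a new diff by cosine similarity, and reuse its commit message. This is a minimal illustration under those assumptions, not the paper's implementation; retrieve_message and its arguments are hypothetical names.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_message(new_diff, train_diffs, train_messages):
    # Vectorize the training diffs and the incoming diff as token counts
    # (bag-of-words), a simple stand-in for the diff representation.
    vectorizer = CountVectorizer()
    train_matrix = vectorizer.fit_transform(train_diffs)
    query_vec = vectorizer.transform([new_diff])
    # Pick the training diff with the highest cosine similarity to the query
    # and reuse its commit message as the generated message.
    similarities = cosine_similarity(query_vec, train_matrix)[0]
    return train_messages[similarities.argmax()]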
Code Generation as a Dual Task of Code Summarization
TLDR
This paper proposes a dual training framework that trains the two tasks simultaneously, exploiting the dualities of probability and attention weights and designing corresponding regularization terms to constrain them.
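The probability duality mentioned above can be encoded as a regularizer that penalizes disagreement between the two factorizations of the joint probability, P(x)P(y|x) = P(y)P(x|y). The sketch below is one common formulation under that assumption, not necessarily the paper's exact term; the marginal log-probabilities are assumed to come from separate language models over code and comments, and the inputs are per-example torch tensors.

import torch

def probability_duality_loss(log_px, log_py, log_py_given_x, log_px_given_y):
    # Both factorizations of log P(x, y) should agree; penalize the squared gap.
    gap = (log_px + log_py_given_x) - (log_py + log_px_given_y)
    return (gap ** 2).mean()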