Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports
- Jian Zhou, Hongyu Zhang, D. Lo
- Computer Science, International Conference on Software Engineering
- 1 June 2012
The results show that BugLocator can effectively locate the files where the bugs should be fixed, and outperforms existing state-of-the-art bug localization methods.
Deep Code Comment Generation
- Xing Hu, Ge Li, Xin Xia, D. Lo, Zhi Jin
- Computer Science, IEEE International Conference on Program…
- 1 May 2018
DeepCom applies Natural Language Processing (NLP) techniques to learn from a large code corpus and generates comments for Java methods from the learned features.
Towards more accurate retrieval of duplicate bug reports
- Chengnian Sun, D. Lo, Siau-Cheng Khoo, Jing Jiang
- Computer Science, International Conference on Automated Software…
- 1 November 2011
A retrieval function (REP) to measure the similarity between two bug reports, which fully utilizes the information available in a bug report including not only the similarity of textual content in summary and description fields, but also similarity of non-textual fields such as product, component, version, etc.
History Driven Program Repair
- Xuan-Bach D. Le, D. Lo, Claire Le Goues
- Computer Science, IEEE International Conference on Software…
- 14 March 2016
This work proposes a new technique that utilizes the wealth of bug fixes across projects in their development history to effectively guide and drive a program repair process, and can produce good-quality fixes for many more bugs as compared to the baselines while being reasonably computationally efficient.
Summarizing Source Code with Transferred API Knowledge
- Xing Hu, Ge Li, Xin Xia, D. Lo, Shuai Lu, Zhi Jin
- Computer Science, International Joint Conference on Artificial…
- 1 July 2018
Experiments on large-scale real-world industry Java projects indicate that the proposed novel approach, named TL-CodeSum, is effective and outperforms the state-of-the-art in code summarization.
A discriminative model approach for accurate duplicate bug report retrieval
- Chengnian Sun, D. Lo, Xiaoyin Wang, Jing Jiang, Siau-Cheng Khoo
- Computer Science, ACM/IEEE 32nd International Conference on…
- 1 May 2010
This paper leverages recent advances on using discriminative models for information retrieval to detect duplicate bug reports more accurately and shows that this technique could result in 17-31%, 22-26%, and 35-43% relative improvement over state-of-the-art techniques in OpenOffice, Firefox, and Eclipse datasets respectively using commonly available natural language information only.
Deep code comment generation with hybrid lexical and syntactical information
Experimental results demonstrate that the method Hybrid-DeepCom outperforms the state-of-the-art by a substantial margin, and the results show that reducing the out-of-vocabulary tokens improves the accuracy effectively.
What Security Questions Do Developers Ask? A Large-Scale Study of Stack Overflow Posts
- Xinli Yang, D. Lo, Xin Xia, Zhiyuan Wan, Jianling Sun
- Computer Science, Journal of Computational Science and Technology
- 5 September 2016
A large-scale study of security-related questions on Stack Overflow, which summarizes all the topics into five main categories and investigates the popularity and difficulty of the different topics.
Version history, similar report, and structure: putting them together for improved bug localization
A new method for locating relevant buggy files that puts together version history, similar reports, and structure is proposed, and a large-scale experiment is performed on four open source projects to localize more than 3,000 bugs.
HYDRA: Massively Compositional Model for Cross-Project Defect Prediction
- Xin Xia, D. Lo, Sinno Jialin Pan, Nachiappan Nagappan, Xinyu Wang
- Computer Science, IEEE Transactions on Software Engineering
- 1 October 2016
Improvements of HYDRA over other baseline approaches in terms of F1-score and when inspecting the top 20 percent lines of code are substantial, and in most cases the improvements are significant and have large effect sizes across the 29 datasets.