Tien N. Nguyen

Learn More
In today's software-centric world, ultra-large-scale software repositories, e.g. SourceForge (350,000+ projects), GitHub (250,000+ projects), and Google Code (250,000+ projects) are the new library of Alexandria. They contain an enormous corpus of software and information about software. Scientists and engineers alike are interested in analyzing this(More)
Model-Driven Engineering (MDE) has become an important development framework for many large-scale software. Previous research has reported that as in traditional code-based development, cloning also occurs in MDE. However, there has been little work on clone detection in models with the limitations on detection precision and completeness. This paper(More)
Previous research confirms the existence of <i>recurring bug fixes</i> in software systems. Analyzing such fixes manually, we found that a large percentage of them occurs in <i>code peers</i>, the classes/methods having the similar roles in the systems, such as providing similar functions and/or participating in similar object interactions. Based on(More)
Locating buggy code is a time-consuming task in software development. Given a new bug report, developers must search through a large number of files in a project to locate buggy code. We propose BugScout, an automated approach to help developers reduce such efforts by narrowing the search space of buggy files when they are assigned to address a bug report.(More)
<i>n</i>-gram statistical language model has been successfully applied to capture programming patterns to support code completion and suggestion. However, the approaches using <i>n</i>-gram face challenges in capturing the patterns at higher levels of abstraction due to the mismatch between the sequence nature in <i>n</i>-grams and the structure nature of(More)
Bug triaging aims to assign a bug to the most appropriate fixer. That task is crucial in reducing time and efforts in a bug fixing process. In this paper, we propose Bugzie, a novel approach for automatic bug triaging based on fuzzy set and cache-based modeling of the bug-fixing expertise of developers. Bugzie considers a software system to have multiple(More)
Recent research results suggest a need for code clone management. In this paper, we introduce JSync, a novel clone management tool. JSync provides two main functions to support developers in being aware of the clone relation among code fragments as software systems evolve and in making consistent changes as they create or modify cloned code. JSync(More)
Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the(More)
Reusing existing library components is essential for reducing the cost of software development and maintenance. When library components evolve to accommodate new feature requests, to fix bugs, or to meet new standards, the clients of software libraries often need to make corresponding changes to correctly use the updated libraries. Existing API usage(More)
The interplay of multiple objects in object-oriented programming often follows specific protocols, for example certain orders of method calls and/or control structure constraints among them that are parts of the intended object usages. Unfortunately, the information is not always documented. That creates long learning curve, and importantly, leads to subtle(More)