• Publications
Portfolio: finding relevant functions and their usage
Different studies show that programmers are more interested in finding definitions of functions and their uses than variables, statements, or arbitrary code fragments [30, 29, 31]. Therefore, …
  • 228
  • 18
Detecting similar software applications
Although popular text search engines allow users to retrieve similar web pages, source code search engines do not have this feature. Detecting similar applications is a notoriously difficult problem, …
  • 115
  • 17
Automatically generating commit messages from diffs using neural machine translation
Commit messages are a valuable resource in comprehension of software evolution, since they provide a record of changes such as feature additions and bug repairs. Unfortunately, programmers often …
  • 74
  • 12
Automatic documentation generation via source code summarization of method context
A documentation generator is a programming tool that creates documentation for software by analyzing the statements and comments in the software's source code. While many of these tools are manual, …
  • 126
  • 10
Automatic Source Code Summarization of Context for Java Methods
Source code summarization is the task of creating readable summaries that describe the functionality of software. Source code summarization is a critical component of documentation generation, for …
  • 61
  • 9
On using machine learning to automatically classify software applications into domain categories
Software repositories hold applications that are often categorized to improve the effectiveness of various maintenance tasks. Properly categorized applications allow stakeholders to identify …
  • 53
  • 8
When and How Using Structural Information to Improve IR-Based Traceability Recovery
Information Retrieval (IR) has been widely accepted as a method for automated traceability recovery based on the textual similarity among the software artifacts. However, a notorious difficulty for …
  • 47
  • 8
Exemplar: A Source Code Search Engine for Finding Highly Relevant Applications
A fundamental problem of finding software applications that are highly relevant to development tasks is the mismatch between the high-level intent reflected in the descriptions of these tasks and …
  • 98
  • 6
An empirical investigation into a large-scale Java open source code repository
Getting insight into different aspects of source code artifacts is increasingly important -- yet there is little empirical research using large bodies of source code, and subsequently there are not …
  • 91
  • 6
A Neural Model for Generating Natural Language Summaries of Program Subroutines
Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly-growing research topic with applications to automatic documentation generation, program …
  • 21
  • 6