Learn More
Topics are collections of words that co-occur frequently in a text corpus. Topics have been found to be effective tools for describing the major themes spanning a corpus. Using such topics to describe the evolution of a software system's source code promises to be extremely useful for development tasks such as maintenance and re-engineering. However, no one(More)
—We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator(More)
A perspective view of a slanted textured surface shows systematic changes in the density, area, and aspect-ratio of texture elements. These apparent changes in texture element properties can be analyzed to recover information about the physical layout of the scene. However, in practice it is difficult to identify texture elements, especially in images where(More)
Studying the evolution of topics (collections of co-occurring words) in a software project is an emerging technique to automatically shed light on how the project is changing over time: which topics are becoming more actively developed, which ones are dying down, or which topics are lately more error-prone and hence require more testing. Existing techniques(More)
—Bug localization is the task of determining which source code entities are relevant to a bug report. Manual bug localization is labor intensive, since developers must consider thousands of source code entities. Current research builds bug localization classifiers, based on information retrieval models, to locate entities that are textually similar to the(More)
Performance evaluation of document recognition systems is a difficult and practically important problem. Issues arise in defining requirements, in characterizing the system's range of inputs and outputs, in interpreting published performance evaluation results, in reproducing performance evaluation experiments, in choosing training and test data, and in(More)