Predicting the incidence of faults in code has been commonly associated with measuring complexity. In this paper, we propose complexity metrics that are based on the code change process instead of on the code. We conjecture that a complex code change process negatively affects its product, i.e., the software system. We validate our hypothesis empirically ...
Programming question and answer (Q&A) websites, such as Stack Overflow, leverage the knowledge and expertise of users to provide answers to technical questions. Over time, these websites turn into repositories of software engineering knowledge. Such knowledge repositories can be invaluable for gaining insight into the use of specific technologies and the ...
Software systems contain entities, such as functions and variables, which are related to each other. As a software system evolves to accommodate new features and repair bugs, changes occur to these entities. Developers must ensure that related entities are updated to be consistent with these changes. This paper addresses the question: How does a change in ...
To aid software analysis and maintenance tasks, a number of software clustering algorithms have been proposed to automatically partition a software system into meaningful subsystems or clusters. However, it is unknown whether these algorithms produce similar meaningful clusterings for similar versions of a real-life software system under continual change ...
Bug prediction models are often used to help allocate software quality assurance efforts (e.g., testing and code reviews). Mende and Koschke have recently proposed bug prediction models that are effort-aware. These models factor in the effort needed to review or test code when evaluating the effectiveness of prediction models, leading to more realistic ...
Defect prediction models are a well-known technique for identifying defect-prone files or packages so that practitioners can allocate their quality assurance efforts (e.g., testing and code reviews). However, once the critical files or packages have been identified, developers still need to spend considerable time drilling down to the functions or even ...
Defect prediction models help software quality assurance teams to effectively allocate their limited resources to the most defect-prone software modules. A variety of classification techniques have been used to build defect prediction models, ranging from simple (e.g., logistic regression) to advanced techniques (e.g., Multivariate Adaptive Regression ...
Risk assessment is an essential part of managing software development. Performing risk assessment during the early development phases enhances resource allocation decisions. In order to improve the software development process and the quality of software products, we need to be able to build risk analysis models based on data that can be collected early in ...
Developer mailing lists are a rich source of information about Open Source Software (OSS) development. The unstructured nature of email makes extracting information difficult. We use a psychometrically based linguistic analysis tool, the LIWC, to examine the Apache httpd server developer mailing list. We conduct three preliminary experiments to assess the ...
Software code review, i.e., the practice of having third-party team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that the formal code inspections of the past tend to improve the quality of software delivered by students and small teams. However, ...