Martin J. Shepperd

Learn More
Accurate project effort prediction is an important goal for the software engineering community. To date most work has focused upon building algorithmic models of effort, for example COCOMO. These can be calibrated to local environments. We describe an alternative approach to estimation based upon the use of analogies. The underlying principle is to(More)
This paper aims to provide the software estimation research community with a better understanding of the meaning of, and relationship between, two statistics that are often used to assess the accuracy of predictive models: the mean magnitude relative error, MMRE, and the number of predictions within 25% of the actuals, pred(25). We demonstrate that MMRE and(More)
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems. Method: A new framework is(More)
This paper aims to provide a basis for the improvement of software-estimation research through a systematic review of previous work. The review identifies 304 software cost estimation papers in 76 journals and classifies the papers according to research topic, estimation approach, research approach, study context and data set. A Web-based library of these(More)
This paper describes an empirical investigation into an industrial object-oriented (OO) system comprising 133,000 lines of C++. The system was a sub system of a telecommunications product and was developed using the Shlaer-Mellor method. From this study we found that there was little use of OO constructs such as inheritance and therefore polymorphism. It(More)
The staff resources or effort required for a software project are notoriously difficult to estimate in advance. To date most work has focused upon algorithmic cost models such as COCOMO and Function Points. These can suffer from the disadvantage of the need to calibrate the model to each individual measurement environment coupled with very variable accuracy(More)
This paper investigates the use of various techniques including genetic programming, with public data sets, to attempt to model and hence estimate software project effort. The main research question is whether genetic programs can offer 'better' solution search using public domain metrics rather than company specific ones. Unlike most previous research, a(More)
ÐThe need for accurate software prediction systems increases as software becomes much larger and more complex. A variety of techniques have been proposed; however, none has proven consistently accurate and there is still much uncertainty as to what technique suits which type of prediction problem. We believe that the underlying characteristicsÐsize, number(More)
Empirical studies on software prediction models do not converge with respect to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we have examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross(More)
Background--Self-evidently empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and using the same rather than similar versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The(More)