Learn More
The epidemiological question of concern here is " can young children at risk of obesity be identified from their early growth records? " Pilot work using logistic regression to predict overweight and obese children demonstrated relatively limited success. Hence we investigate the incorporation of non-linear interactions to help improve accuracy of(More)
Accelerating the learning curve of software maintainers working on systems with which they have little familiarity motivated this study. A working hypothesis was that automated methods are needed to provide a fast, rough grasp of a system, to enable practitioners not familiar with it, to commence maintenance with a level of confidence as if they had this(More)
Data mining and its capacity to deal with large volumes of data and to uncover hidden patterns has been proposed as a means to support industrial scale software maintenance and comprehension. This paper presents a methodology for knowledge acquisition from source code in order to comprehend an object-oriented system and evaluate its maintainability. We(More)
Program comprehension is an important part of software maintenance, especially when program structure is complex and documentation is unavailable or outdated. Data mining can produce structural views of source code thus facilitating legacy systems understanding. This paper presents a method for mining association rules from code aiming at capturing program(More)
This work proposes a methodology for source code quality and static behaviour evaluation of a software system, based on the standard ISO/IEC-9126. It uses elements automatically derived from source code enhanced with expert knowledge in the form of quality characteristic rankings, allowing software engineers to assign weights to source code attributes. It(More)
Clustering is particularly useful in problems where there is little prior information about the data under analysis. This is usually the case when attempting to evaluate a software system's maintainability, as many dimensions must be taken into account in order to reach a conclusion. On the other hand partitional clustering algorithms suffer from being(More)
Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the mus time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets(More)
Data mining is a technology recently used in support of software maintenance in various contexts. Our works focuses on achieving a high level understanding of Java systems without prior familiarity with these. Our thesis is that system structure and interrelationships, as well as similarities among program components can be derived by applying cluster(More)
In recent years interest has grown in " mining " large databases to extract novel and interesting information. Knowledge Discovery in Databases (KDD) has been recognised as an emerging research area. Association rules discovery is an important KDD technique for better data understanding. This paper proposes an enhancement with a memory efficient data(More)