Data Mining Static Code Attributes to Learn Defect Predictors
TLDR
It is shown that how static code attributes are used to build defect predictors is much more important than which particular attributes are used, and that, contrary to prior pessimism, such predictors are demonstrably useful, yielding a usefully high mean probability of detection at a low mean false alarm rate.
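The evaluation measures named above can be made concrete with a minimal sketch. Here a deliberately simple single-attribute predictor (flagging modules whose lines-of-code metric exceeds a threshold) is scored by probability of detection (pd) and probability of false alarm (pf); the module data and threshold are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch: a one-attribute defect predictor over static code
# metrics, scored by pd (probability of detection) and pf (false alarm rate).
# The module rows (LOC, is_defective) and the threshold are invented examples.

def predict(loc, threshold=100):
    """Flag a module as defect-prone when its LOC metric exceeds a threshold."""
    return loc > threshold

def pd_pf(modules, threshold=100):
    tp = fn = fp = tn = 0
    for loc, defective in modules:
        flagged = predict(loc, threshold)
        if defective and flagged:
            tp += 1          # defect correctly flagged
        elif defective:
            fn += 1          # defect missed
        elif flagged:
            fp += 1          # clean module wrongly flagged
        else:
            tn += 1          # clean module correctly passed
    pd = tp / (tp + fn) if tp + fn else 0.0  # recall over defective modules
    pf = fp / (fp + tn) if fp + tn else 0.0  # alarm rate over clean modules
    return pd, pf

modules = [(250, True), (120, True), (80, True), (40, False),
           (90, False), (300, False), (60, False), (150, True)]
print(pd_pf(modules))
```

A real learner (e.g. Naive Bayes over many McCabe/Halstead attributes) replaces the threshold rule, but the pd/pf scoring is the same.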
On the relative value of cross-company and within-company data for defect prediction
TLDR
It is demonstrated in this paper that the minimum number of data samples required to build effective defect predictors can be quite small and can be collected within just a few months.
Automated severity assessment of software defect reports
TLDR
The paper presents a new and automated method named SEVERIS (severity issue assessment), which assists the test engineer in assigning severity levels to defect reports, based on standard text mining and machine learning techniques applied to existing sets of defect reports.
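The core idea of text-mining defect reports for severity can be sketched in a few lines: represent each report as a word-count vector and assign the severity of the most similar labeled report. This is a hypothetical stand-in for SEVERIS, which uses more sophisticated feature selection and rule learning; the reports and labels below are invented.

```python
from collections import Counter

# Hypothetical sketch of severity assessment via text mining: word-count
# vectors plus 1-nearest-neighbor assignment by cosine similarity.
# The labeled reports and severity levels are invented examples.

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def assign_severity(report, labeled_reports):
    """Return the severity label of the most similar labeled report."""
    vec = vectorize(report)
    return max(labeled_reports,
               key=lambda item: cosine(vec, vectorize(item[0])))[1]

labeled = [("system crash on startup data loss", "critical"),
           ("minor typo in help dialog text", "trivial")]
print(assign_severity("application crash with data loss", labeled))
```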
On the value of user preferences in search-based software engineering: A case study in software product lines
TLDR
The conclusion is that search-based software engineering methods need to change, particularly when studying complex decision spaces, since methods in widespread use perform much worse than IBEA (Indicator-Based Evolutionary Algorithm).
On the Value of Ensemble Effort Estimation
TLDR
While no single effort estimation method is best overall, certain combinations of such methods consistently perform best.
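The combination idea can be illustrated with a minimal sketch: run several simple estimators on the same project size and take a robust summary of their outputs. The individual estimators below (a COCOMO-like power law, a linear productivity ratio, and a fixed analogy value) and their constants are invented stand-ins, not the methods evaluated in the paper.

```python
import statistics

# Hypothetical sketch of ensemble effort estimation: combine the outputs of
# several simple estimators. All estimators and constants are invented.

def power_law(kloc, a=2.94, b=1.1):
    """COCOMO-style estimate: effort = a * size^b (person-months)."""
    return a * kloc ** b

def productivity(kloc, pm_per_kloc=3.0):
    """Linear productivity-based estimate."""
    return kloc * pm_per_kloc

def analogy(kloc, past_efforts=(28.0, 35.0, 31.0)):
    """Analogy-style estimate: median effort of similar past projects."""
    return statistics.median(past_efforts)

def ensemble_estimate(kloc):
    estimates = [power_law(kloc), productivity(kloc), analogy(kloc)]
    return statistics.median(estimates)  # median is robust to one bad method

print(round(ensemble_estimate(10.0), 2))
```

Taking the median (rather than the mean) keeps a single wildly wrong method from dominating the combined estimate.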
Scalable product line configuration: A straw to break the camel's back
TLDR
This paper presents simple heuristics that help the Indicator-Based Evolutionary Algorithm (IBEA) in finding sound and optimum configurations of very large variability models in the presence of competing objectives.
Better cross company defect prediction
TLDR
This paper finds that: 1) within-company predictors are weak for small data sets; 2) the Peters filter + cross-company data builds better predictors than both within-company data and the Burak filter + cross-company data; and 3) the Peters filter builds 64% more useful predictors than both the within-company and the Burak filter + cross-company approaches.
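The relevancy-filtering idea behind these results can be sketched simply. A Burak-style filter keeps, for each within-company (test) instance, its k nearest cross-company (training) rows by Euclidean distance over the metric features, then trains only on that subset. The metric rows below are invented examples.

```python
# Hypothetical sketch of a Burak-style relevancy filter for cross-company
# defect prediction: select cross-company training rows near the test rows.
# The (metric_1, metric_2) feature rows are invented examples.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def burak_filter(cross_rows, test_rows, k=2):
    """Keep the union of each test row's k nearest cross-company rows."""
    selected = set()
    for t in test_rows:
        nearest = sorted(range(len(cross_rows)),
                         key=lambda i: euclidean(cross_rows[i], t))[:k]
        selected.update(nearest)
    return [cross_rows[i] for i in sorted(selected)]

cross = [(10, 1), (200, 9), (12, 2), (190, 8), (500, 30)]
test = [(11, 1), (15, 2)]
print(burak_filter(cross, test, k=2))
```

The Peters filter inverts the direction of the search, letting the cross-company training rows pick their nearest test rows, which is what gives it an edge on small within-company data sets.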
Heterogeneous Defect Prediction
TLDR
This paper identifies categories of data sets where as few as 50 instances are enough to build a defect prediction model, and shows, both empirically and theoretically, that "large enough" may be very small indeed.
...