Bias-variance analysis provides a tool to study learning algorithms and can be used to design ensemble methods tuned to the properties of a specific base learner. Indeed, the effectiveness of ensemble methods depends critically on the accuracy, diversity and learning characteristics of the base learners. We present an extended experimental analysis of …
Recently, the bias-variance decomposition of error has been used as a tool to study the behavior of learning algorithms and to develop new ensemble methods well suited to the bias-variance characteristics of base learners. We propose methods and procedures, based on Domingos' unified bias-variance theory, to evaluate and quantitatively measure the bias-variance …
Expression-based classification of tumors requires stable and reliable methods that reduce variance, as DNA microarray data are characterized by small sample size, high dimensionality, noise and large biological variability. To address the variance and curse-of-dimensionality problems arising in this difficult task, we propose to apply bagged ensembles of …
BACKGROUND: Cluster analysis has been widely applied to investigate structure in bio-molecular data. A drawback of most clustering algorithms is that they cannot automatically detect the "natural" number of clusters underlying the data, and in many cases we do not have enough a priori biological knowledge to evaluate both the number of clusters as well as …
The R package mosclust (model order selection for clustering problems) implements algorithms based on the concept of stability for discovering significant structures in bio-molecular data. The software library provides stability indices obtained through different data perturbation methods (resampling, random projections, noise injection), as …
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for …
The ranking and prediction of novel therapeutic categories for existing drugs (drug repositioning) is a challenging computational problem involving the analysis of complex chemical and biological networks. In this context we propose a novel semi-supervised learning problem: ranking drugs in integrated biochemical networks according to specific DrugBank …