Samantha Riccadonna

Learn More
The number of available algorithms to infer a biological network from a dataset of high-throughput measurements is overwhelming and keeps growing. However, evaluating their performance is unfeasible unless a 'gold standard' is available to measure how close the reconstructed network is to the ground truth. One measure of this is the stability of these(More)
mlpy is a Python Open Source Machine Learning library built on top of NumPy/SciPy and the GNU Scientific Libraries. mlpy provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems and it is aimed at finding a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency.(More)
We show that the Confusion Entropy, a measure of performance in multiclass problems has a strong (monotone) relation with the multiclass generalization of a classical metric, the Matthews Correlation Coefficient. Analytical results are provided for the limit cases of general no-information (n-face dice rolling) of the binary classification. Computational(More)
The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data(More)
UNLABELLED We introduce a novel implementation in ANSI C of the MINE family of algorithms for computing maximal information-based measures of dependence between two variables in large datasets, with the aim of a low memory footprint and ease of integration within bioinformatics pipelines. We provide the libraries minerva (with the R interface) and minepy(More)
INTRODUCTION The traditional staging system is inadequate to identify those patients with stage II colorectal cancer (CRC) at high risk of recurrence or with stage III CRC at low risk. A number of gene expression signatures to predict CRC prognosis have been proposed, but none is routinely used in the clinic. The aim of this work was to assess the(More)
Identifying the molecular pathways more prone to disruption during a pathological process is a key task in network medicine and, more in general, in systems biology. In this work we propose a pipeline that couples a machine learning solution for molecular profiling with a recent network comparison method. The pipeline can identify changes occurring between(More)
The Canberra distance is the sum of absolute values of the differences between ranks divided by their sum, thus it is a weighted version of the L1 distance. As a metric on permutation groups, the Canberra distance is a measure of disarray for ranked lists, where rank differences in top positions need to pay higher penalties than movements in the bottom part(More)
We present an interactive real-time visualization environment of time-ordered medical data aimed to support multidisciplinary disease management of patients with heart failure. Our prototype has been integrated into a working Electronic Patient Record implemented for the integrated management of heart failure patients. Since January 2005, the system has(More)
Due to the ever rising importance of the network paradigm across several areas of science, comparing and classifying graphs represent essential steps in the networks analysis of complex systems. Both tasks have been recently tackled via quite different strategies, even tailored ad-hoc for the investigated problem. Here we deal with both operations by(More)