MOTIVATION: Clinical data, such as patient history, laboratory analysis, and ultrasound parameters, which are the basis of day-to-day clinical decision support, are often underused to guide the clinical management of cancer in the presence of microarray data. We propose a strategy based on Bayesian networks to treat clinical and microarray data on an equal…
MOTIVATION: Microarrays are capable of determining the expression levels of thousands of genes simultaneously. In combination with classification methods, this technology can be useful to support clinical management decisions for individual patients, e.g. in oncology. The aim of this paper is to systematically benchmark the role of non-linear versus linear…
MOTIVATION: Microarray experiments generate a considerable amount of data which, when analyzed properly, helps us gain a large amount of biologically relevant information about global cellular behaviour. Clustering (grouping genes with similar expression profiles) is one of the first steps in data analysis of high-throughput expression measurements. A number of…
INCLUSive allows automatic multistep analysis of microarray data (clustering and motif finding). The clustering algorithm (adaptive quality-based clustering) groups together genes with highly similar expression profiles. The upstream sequences of the genes belonging to a cluster are automatically retrieved from GenBank and can be fed directly into Motif…
INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence…
EnsembleSVM is a free software package containing efficient routines to perform ensemble learning with support vector machine (SVM) base models. It currently offers ensemble methods based on binary SVM models. Our implementation avoids duplicate storage and evaluation of support vectors that are shared between constituent models. Experimental results show…
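The idea of storing and evaluating shared support vectors only once can be illustrated with a small sketch. This is not EnsembleSVM's actual code or API: the class `SharedSVEnsemble`, its methods, the RBF kernel choice, and the NumPy-based layout are all assumptions made for illustration. Each base model keeps only indices into a common pool of support vectors, so the kernel between a test instance and a pooled support vector is computed once, however many models use it.

```python
import numpy as np


def rbf(X, Z, gamma):
    """RBF kernel matrix of shape (len(Z), len(X))."""
    d2 = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)


class SharedSVEnsemble:
    """Hypothetical ensemble whose base models index into one SV pool."""

    def __init__(self, pool, gamma):
        self.pool = np.asarray(pool, dtype=float)  # unique support vectors
        self.gamma = gamma
        self.models = []  # one (sv_indices, coefficients, bias) per model

    def add_model(self, sv_idx, coef, bias):
        self.models.append(
            (np.asarray(sv_idx), np.asarray(coef, dtype=float), float(bias))
        )

    def decision_values(self, Z):
        # Every pooled support vector is evaluated exactly once per test
        # instance; each model then reuses the columns it needs.
        K = rbf(self.pool, np.asarray(Z, dtype=float), self.gamma)
        return np.column_stack(
            [K[:, idx] @ coef + bias for idx, coef, bias in self.models]
        )
```

With this layout, the number of kernel evaluations depends on the number of distinct support vectors rather than on the total across all base models, which is where the saving comes from when models overlap heavily.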
Microarray classification can be useful to support clinical management decisions for individual patients in, for example, oncology. However, comparing classifiers and selecting the best for each microarray dataset can be a tedious and non-straightforward task. The M@CBETH (a MicroArray Classification BEnchmarking Tool on a Host server) web service offers…
We present an approximation scheme for support vector machine models that use an RBF kernel. A second-order Maclaurin series approximation is used for exponentials of inner products between support vectors and test instances. The approximation is applicable to all kernel methods featuring sums of kernel evaluations and makes no assumptions regarding data…
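A minimal sketch of how such an approximation can work, assuming the standard factorization of the RBF kernel, exp(-γ‖x − z‖²) = exp(-γ‖x‖²) · exp(-γ‖z‖²) · exp(2γ xᵀz), and the second-order Maclaurin expansion exp(t) ≈ 1 + t + t²/2 applied to t = 2γ xᵀz. The function names, the NumPy implementation, and the way the sums are collapsed into a scalar `c`, a vector `v` and a matrix `M` are illustrative choices, not taken from the paper. The approximation is only accurate when |2γ xᵀz| is small.

```python
import numpy as np


def exact_decision(X_sv, coef, b, gamma, Z):
    """Exact RBF decision values: sum_i coef_i * k(x_i, z) + b."""
    d2 = ((Z[:, None, :] - X_sv[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2) @ coef + b


def fit_maclaurin(X_sv, coef, gamma):
    """Collapse all support vectors into a scalar, a vector and a matrix."""
    # Fold the exp(-gamma * ||x_i||^2) factor into each coefficient.
    w = coef * np.exp(-gamma * (X_sv ** 2).sum(axis=1))
    c = w.sum()                                   # zeroth-order term
    v = 2.0 * gamma * (X_sv.T @ w)                # first-order term
    M = 2.0 * gamma ** 2 * (X_sv.T * w) @ X_sv    # second-order term
    return c, v, M


def approx_decision(c, v, M, b, gamma, Z):
    """Approximate decision values; cost no longer depends on the SV count."""
    quad = np.einsum("ij,jk,ik->i", Z, M, Z)      # z^T M z for every row z
    return np.exp(-gamma * (Z ** 2).sum(axis=1)) * (c + Z @ v + quad) + b
```

After `fit_maclaurin`, each prediction costs O(d²) in the input dimension d instead of O(n·d) in the number of support vectors n, which is the appeal of this kind of scheme for models with many support vectors.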
We present a novel approach to learn binary classifiers when only positive and unlabeled instances are available (PU learning). This problem is routinely cast as a supervised task with label noise in the negative set. We use an ensemble of SVM models trained on bootstrap subsamples of the training data for increased robustness against label noise. The…
Assessing the performance of a learned model is a crucial part of machine learning. Most evaluation metrics can only be computed with labeled data. Unfortunately, in many domains we have many more unlabeled than labeled examples. Furthermore, in some domains only positive and unlabeled examples are available, in which case most standard metrics cannot be…