Predictive clustering is a general framework that unifies clustering and prediction. This paper investigates how to apply this framework to cluster time series data. The resulting system, Clus-TS, constructs predictive clustering trees (PCTs) that partition a given set of time series into homogeneous clusters. In addition, PCTs provide a symbolic…
Nm23-H1 is one of the most interesting candidate genes for a relevant role in neuroblastoma pathogenesis. H-Prune is the best-characterized Nm23-H1 binding partner, and its overexpression has been shown in different human cancers. Our study focuses on the role of the Nm23-H1/h-Prune protein complex in neuroblastoma. Using NMR spectroscopy, we performed a…
In biology, analyzing time course data is usually a two-step process, beginning with clustering of similar temporal profiles. After the initial clustering, depending on the expert's knowledge, descriptions of the clusters are elucidated (e.g., Gene Ontology terms that are enriched in the clusters). In this paper, we investigate the application of so-called…
In this paper, we investigate the problem of evaluating ranked lists of biomarkers, which are typically an output of the analysis of high-throughput data. Such a list can consist of probes from microarray experiments, ordered by the strength of their correlation to a disease. Usually, the ordering of the biomarkers in the ranked lists varies considerably if…
In this paper, a querying environment for the analysis of patient clinical data is presented. The data consist of two parts: patients' pathological data and data about corresponding gene expression levels. The querying environment includes a generic algorithm for constructing decision trees, as well as algorithms for discretizing gene expression levels and for…
In recent years, the data available for analysis in machine learning have become very high-dimensional and structured in more complex ways. This emphasises the need for developing machine learning algorithms that are able to tackle both the high dimensionality and the complex structure of the data. Our work in this paper focuses on extending a…
In this paper, we provide initial data mining results on four sets of genetic data, collected in the context of the new European Embryonal Tumour Pipeline project. These data sets provide different views on the genetic processes involved in the genesis and development of a specific type of tumour, known as neuroblastoma. Although the project involves other…
In this work, we present a feature ranking method for multi-label data. The method is motivated by practically relevant multi-label applications, such as semantic annotation of images and videos, functional genomics, and music and text categorization. We propose a feature ranking method based on random forests. Considering the success of the feature…
The Kilobot is a widely used platform for the investigation of swarm robotics. Physical Kilobots are slow-moving and require frequent recalibration and charging, which significantly slows down the development cycle. Simulators can speed up the process of testing, exploring and hypothesis generation, but usually require time-consuming and error-prone…