Genomic data integration is a key step towards large-scale genomic data analysis. This process is very challenging due to the diverse sources of information resulting from genomics experiments. In this work, we review methods designed to combine genomic data recorded from microarray gene expression (MAGE) experiments. It has been acknowledged…
Genomics datasets are increasingly useful for gaining biomedical insights, with adoption in the clinic underway. However, multiple hurdles related to data management stand in the way of their efficient large-scale utilization. The solution proposed is a web-based data storage hub. Having clear focus, flexibility and adaptability, InSilico DB seamlessly…
An increasing number of microarray gene expression data sets are available through public repositories. Their huge potential for new findings is yet to be unlocked by making them available for large-scale analysis. To do so, it is essential that independent studies designed for similar biological problems can be integrated, so that new insights…
A plenitude of feature selection (FS) methods is available in the literature, most of them arising from the need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas such as combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a general accepted…
With an abundance of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, analysis itself is no longer the problem, but retrieving and consistently integrating all this data before delivering it to the wide variety of existing analysis tools…
Microarray technology has become an integral part of biomedical research, and an increasing number of datasets are becoming available through public repositories. However, re-use of these datasets is severely hindered by unstructured, missing or incorrect biological sample information, as well as by the wide variety of preprocessing methods in use. The inSilicoDb…
The clustering capabilities of the Non-Negative Matrix Factorization algorithm are studied. The basis images are interpreted as the membership degrees of the data to a particular class. A hard clustering algorithm is easily derived from these images. This algorithm is applied to a multivariate image to perform image segmentation. The results are compared…
The potential of microarray gene expression (MAGE) data is only partially explored due to the limited number of samples in individual studies. This limitation can be surmounted by merging or integrating data sets originating from independent MAGE experiments, which are designed to study the same biological problem. However, this process is hindered by batch…
Analyzing unknown data sets such as multi-spectral images often requires unsupervised techniques. Data clustering is a well-known and widely used approach in such cases. Multi-spectral image segmentation requires pixel classification according to a similarity criterion. For this particular data, partitional clustering seems to be more appropriate.
Some clustering algorithms require assumptions (such as the number and shape of classes), which limit their performance or lead to wrong results. By contrast, methods based on the estimation of the probability density function (pdf) make no assumptions about either the shape or the number of classes. Two methods based on the pdf are explored and…