• Publications
Clustering cancer gene expression data: a comparative study
TLDR
This study presents the first large-scale analysis of seven clustering methods and four proximity measures on 35 cancer gene expression data sets. It finds that the finite mixture of Gaussians, followed closely by k-means, performed best at recovering the true structure of the data sets.
Ranking and selecting clustering algorithms using a meta-learning approach
TLDR
A novel framework is presented that applies a meta-learning approach to clustering algorithms: for a given dataset, it ranks the candidate algorithms that could be used with that dataset, in the context of cancer gene expression microarray datasets.
Multi-objective clustering ensemble for gene expression data analysis
TLDR
An algorithm for cluster analysis that integrates aspects of cluster ensembles and multi-objective clustering; it is based on a Pareto-based multi-objective genetic algorithm with a special crossover operator and uses clustering validation measures as objective functions.
Information-theoretic clustering: A representative and evolutionary approach
TLDR
A new cost evaluation function for clustering, called representative CIP (rCIP), that measures the cross information potential (CIP) between clusters on a dataset using representative points. It yields a useful non-parametric estimator of entropy and makes it possible to use cross information potential in applications where it was not.
Comparative study on dimension reduction techniques for cluster analysis of microarray data
TLDR
Overall, results showed that using dimension reduction techniques (DRTs) improves the performance of all algorithms tested, especially those in the hierarchical class; Principal Component Analysis was outperformed by nonlinear methods and did not provide a substantial performance increase in the clustering algorithms.
Clustering Using Elements of Information Theory
TLDR
The proposed algorithm uses "classical" clustering algorithms to initialize small regions (auxiliary clusters) that are then merged to construct the final clusters; it was tested on several databases with different spatial distributions.
Representative cross information potential clustering
TLDR
An information-theoretic approach for clustering with a new measure of cross information potential and two clustering algorithms which explore the idea of creating links between regions of the feature space that are highly correlated.
Fusion Approaches of Feature Selection Algorithms for Classification Problems
TLDR
Two distinct approaches to combining feature selection algorithms (decision fusion and data fusion) were analyzed in a supervised classification context using real and synthetic datasets; one of the proposed approaches achieved the best results for the majority of datasets.
Comparative study on normalization procedures for cluster analysis of gene expression datasets
TLDR
A first large-scale, data-driven comparative study of three normalization procedures applied to cancer gene expression data is presented, evaluated in terms of how well five different clustering algorithms recover the true cluster structure.
Using Big Data and Real-Time Analytics to Support Smart City Initiatives
Abstract: A central issue in the context of smart cities is related to the capability to acquire timely information about city events. This paper describes a platform which focuses on processing
...