The completion of the genome sequence of Plasmodium falciparum revealed that close to 60% of the annotated genome corresponds to hypothetical proteins and that many genes, whose metabolic pathways or biological products are known, have not been predicted from sequence similarity searches. Recently, using global gene expression of the asexual blood stages of ...
For small samples, classifier design algorithms typically suffer from overfitting. Given a set of features, a classifier must be designed and its error estimated. For small samples, an error estimator may be unbiased but, owing to a large variance, often give very optimistic estimates. This paper proposes mitigating the small-sample problem by designing ...
The genome is a highly complex nonlinear control system regulating cell function. One of the primary means for regulating cellular activity is the control of protein production. Protein production is controlled by the amount of mRNA expressed by individual genes. This level of gene expression is modulated by protein machinery that senses conditions internal ...
There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that ...
DNA chip (i.e., microarray) biotechnology is a hybridization-based process (i.e., one based on the matching of pairs of DNA) that makes it possible to quantify the relative abundance of mRNA from two distinct samples by analyzing their fluorescence signals. This technique requires robotic placement (i.e., spotting) of thousands of cDNAs (i.e., complementary DNA) in an array ...
The cDNA microarray technology allows us to estimate the expression of thousands of genes of a given tissue. It is natural then to use such information to classify different cell states, like healthy or diseased, or one particular type of cancer or another. However, usually the number of microarray samples is very small and leads to a classification problem ...