MOTIVATION: A microarray experiment is a multi-step process, and each step is a potential source of variation. There are two major sources of variation: biological variation and technical variation. This study presents a variance-components approach to investigating animal-to-animal, between-array, within-array and day-to-day variations for two data sets.
A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class, resulting in poor accuracy in minority-class prediction. A ...
OBJECTIVE: Personalized medicine is defined by the use of genomic signatures of patients in a target population for assignment of more effective therapies as well as better diagnosis and earlier interventions that might prevent or delay disease. An objective is to find a novel classification algorithm that can be used for prediction of response to therapy in ...
MOTIVATION: Gene class testing (GCT) or gene set analysis (GSA) is a statistical approach to determine whether some functionally predefined sets of genes are expressed differently under different experimental conditions. Shortcomings of Fisher's exact test for overrepresentation analysis are illustrated by an example. Most alternative GSA methods are ...
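For readers unfamiliar with the overrepresentation analysis mentioned in this abstract, the following is a minimal sketch of the one-sided Fisher's exact test as it is commonly applied to a single gene set. It is not taken from the paper; the function name, argument names, and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def overrepresentation_p_value(n_genes, n_de, set_size, n_de_in_set):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    n_genes      total number of genes measured (hypothetical name)
    n_de         genes called differentially expressed overall
    set_size     number of genes in the predefined gene set
    n_de_in_set  differentially expressed genes that fall in the set

    Returns P(X >= n_de_in_set), where X counts the differentially
    expressed genes in a set of `set_size` genes drawn at random
    without replacement from all `n_genes` genes.
    """
    upper = min(n_de, set_size)
    total = comb(n_genes, set_size)
    tail = sum(
        comb(n_de, k) * comb(n_genes - n_de, set_size - k)
        for k in range(n_de_in_set, upper + 1)
    )
    return tail / total


# Illustrative counts only: 20 genes, 5 differentially expressed,
# a gene set of 4 genes containing 3 of the differentially expressed ones.
p = overrepresentation_p_value(n_genes=20, n_de=5, set_size=4, n_de_in_set=3)
```

The test uses only the counts of genes passing a significance cutoff, so the result depends on where that cutoff is placed.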
A robust classification procedure is developed based on ensembles of classifiers, with each classifier constructed from a different set of predictors determined by a random partition of the entire set of predictors. The proposed methods combine the results of multiple classifiers to achieve a substantially improved prediction compared to the optimal single ...
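The general idea described in this abstract, one classifier per block of a random partition of the predictors with the results then combined, might be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the nearest-centroid base classifier, the majority vote, the single partition, and all function names are hypothetical choices that the abstract does not specify.

```python
import random
from collections import Counter


def random_partition(n_predictors, n_blocks, rng):
    """Shuffle predictor indices and split them into disjoint blocks."""
    idx = list(range(n_predictors))
    rng.shuffle(idx)
    return [idx[b::n_blocks] for b in range(n_blocks)]


def fit_centroids(X, y, cols):
    """Assumed base learner: per-class mean vector over the columns in `cols`."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row, cols):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((row[c] - m) ** 2 for c, m in zip(cols, center))
    return min(centroids, key=lambda label: dist(centroids[label]))


def fit_ensemble(X, y, n_blocks, seed=0):
    """Fit one base classifier on each block of a random predictor partition."""
    rng = random.Random(seed)
    blocks = random_partition(len(X[0]), n_blocks, rng)
    return [(cols, fit_centroids(X, y, cols)) for cols in blocks]


def predict_ensemble(ensemble, row):
    """Combine the base classifiers by majority vote (an assumed combiner)."""
    votes = Counter(predict_centroid(cents, row, cols) for cols, cents in ensemble)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: 4 samples, 4 predictors, 2 classes.
X = [[0, 0, 0, 0], [1, 1, 1, 1], [10, 10, 10, 10], [11, 11, 11, 11]]
y = ["a", "a", "b", "b"]
ensemble = fit_ensemble(X, y, n_blocks=2)
label = predict_ensemble(ensemble, [0.5, 0.5, 0.5, 0.5])
```

Because the blocks are disjoint, each predictor contributes to exactly one base classifier in a given partition.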
MOTIVATION: Microarray experiments often involve hundreds or thousands of genes. In a typical experiment, only a fraction of genes are expected to be differentially expressed; in addition, the measured intensities among different genes may be correlated. Depending on the experimental objectives, sample size calculations can be based on one of the three ...
Recent development of high-throughput technology has accelerated interest in the development of molecular biomarker classifiers for safety assessment, disease diagnostics and prognostics, and prediction of response for patient assignment. This article reviews and evaluates some important aspects and key issues in the development of biomarker classifiers.
MOTIVATION: Gene class testing (GCT) is a statistical approach to determine whether some functionally predefined classes of genes are expressed differently under two experimental conditions. GCT computes the P-value of each gene class based on the null distribution, and the gene classes are ranked for importance in accordance with their P-values. Currently, two ...
MOTIVATION: One major area of interest in analyzing oligonucleotide gene array data is identifying differentially expressed genes. A challenge to biostatisticians is to develop an approach to summarizing probe-level information that adequately reflects the true expression level while accounting for probe variation, chip variation and interaction effects.