Andrei Yakovlev

BACKGROUND: Microarray gene expression data are commonly perceived as being extremely noisy because of many imperfections inherent in the current technology. A recent study conducted by the MicroArray Quality Control (MAQC) Consortium and published in Nature Biotechnology provides a unique opportunity to probe into the true level of technical noise in such…
BACKGROUND: The number of genes declared differentially expressed is a random variable and its variability can be assessed by resampling techniques. Another important stability indicator is the frequency with which a given gene is selected across subsamples. We have conducted studies to assess stability and some other properties of several gene selection…
Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test statistics across genes. The empirical Bayes methodology in the nonparametric and parametric formulations, as well as closely related methods employing a two-component mixture model, represent…
BACKGROUND: Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test-statistics across genes. It is frequently assumed that dependence between genes (or tests) is sufficiently weak to justify the proposed methods of testing for differentially…
Understanding the molecular underpinnings of cancer is of critical importance to the development of targeted intervention strategies. Identification of such targets, however, is notoriously difficult and unpredictable. Malignant cell transformation requires the cooperation of a few oncogenic mutations that cause substantial reorganization of many cell…
A test-statistic typically employed in the gene set enrichment analysis (GSEA) prevents this method from being genuinely multivariate. In particular, this statistic is insensitive to changes in the correlation structure of the gene sets of interest. The present paper considers the utility of an alternative test-statistic in designing the confirmatory…
BACKGROUND: To identify differentially expressed genes, it is standard practice to test a two-sample hypothesis for each gene with a proper adjustment for multiple testing. Such tests are essentially univariate and disregard the multidimensional structure of microarray data. A more general two-sample hypothesis is formulated in terms of the joint…
BACKGROUND: Microarray technology is commonly used as a simple screening tool with a focus on selecting genes that exhibit extremely large differential expressions between different phenotypes. It lacks the ability to select genes that change their relationships with other genes in different biological conditions (differentially correlated genes). We intend…
We introduce a nonparametric test intended for large-scale simultaneous inference in situations where the utility of distribution-free tests is limited because of their discrete nature. Such situations are frequently dealt with in microarray analysis where the number of tests is much larger than the sample size. The proposed test statistic is based on a…
Some extended false discovery rate (FDR) controlling multiple testing procedures rely heavily on empirical estimates of the FDR constructed from gene expression data. Such estimates are also used as performance indicators when comparing different methods for microarray data analysis. The present communication shows that the variance of the proposed…