Mohammad Shahrokh Esfahani

Small samples are commonplace in genomic/proteomic classification, the result being inadequate classifier design and poor error estimation. The problem has recently been addressed by utilizing prior knowledge in the form of a prior distribution on an uncertainty class of feature-label distributions. A critical issue remains: how to incorporate biological...
MOTIVATION: Measurements are commonly taken from two phenotypes to build a classifier, where the number of data points from each class is predetermined, not random. In this 'separate sampling' scenario, the data cannot be used to estimate the class prior probabilities. Moreover, predetermined class sizes can severely degrade classifier performance, even for...
BACKGROUND: Accumulation of gene mutations in cells is known to be responsible for tumor progression, driving it from benign states to malignant states. However, previous studies have shown that the detailed sequence of gene mutations, or the steps in tumor progression, may vary from tumor to tumor, making it difficult to infer the exact path that a given...
Circulating tumour DNA (ctDNA) analysis facilitates studies of tumour heterogeneity. Here we employ CAPP-Seq ctDNA analysis to study resistance mechanisms in 43 non-small cell lung cancer (NSCLC) patients treated with the third-generation epidermal growth factor receptor (EGFR) inhibitor rociletinib. We observe multiple resistance mechanisms in 46% of...
Contemporary high-throughput technologies provide measurements of very large numbers of variables but often with very small sample sizes. This paper proposes an optimization-based paradigm for utilizing prior knowledge to design better performing classifiers when sample sizes are limited. We derive approximate expressions for the first and second moments of...
Phenotype classification via genomic data is hampered by small sample sizes that negatively impact classifier design. Utilization of prior biological knowledge in conjunction with training data can improve both classifier design and error estimation via the construction of the optimal Bayesian classifier. In the genomic setting, gene/protein signaling...
We propose a novel optimization-based paradigm for designing enhanced classifiers. The proposed paradigm allows us to incorporate available prior process knowledge into classifier design, thereby improving the performance of the resulting classifiers. In this work, we focus on dynamical systems that can be represented as finite-state multi-dimensional...
It is commonplace in bioinformatics (and elsewhere) to build a classifier from sample data in which the sample sizes of the classes are not random; that is, they are selected prior to sampling. The result is that there is no estimate of the prior class probabilities available from the data. In this paper, we find an analytic result for the minimax solution...
In many contemporary engineering problems, model uncertainty is inherent because accurate system identification is virtually impossible owing to system complexity or lack of data on account of availability, time, or cost. The situation can be treated by assuming that the true model belongs to an uncertainty class of models. In this context, an intrinsically...