Kevin K. Dobbin

PURPOSE: A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how ...
Many gene expression studies attempt to develop a predictor of pre-defined diagnostic or prognostic classes. If the classes are similar biologically, then the number of genes that are differentially expressed between the classes is likely to be small compared to the total number of genes measured. This motivates a two-step process for predictor development, ...
BACKGROUND: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what ...
BACKGROUND: The intraclass correlation coefficient (ICC) is widely used in biomedical research to assess the reproducibility of measurements between raters, labs, technicians, or devices. For example, in an inter-rater reliability study, a high ICC value means that noise variability (between-raters and within-raters) is small relative to variability from ...
Spontaneous canine head and neck squamous cell carcinoma (HNSCC) represents an excellent model of human HNSCC but is greatly understudied. To better understand and utilize this valuable resource, we performed a pilot study that represents its first genome-wide characterization by investigating 12 canine HNSCC cases, of which 9 are oral, via high density ...
MOTIVATION: Implementation and development of statistical methods for high-dimensional data often require high-dimensional Monte Carlo simulations. Simulations are used to assess performance, evaluate robustness, and in some cases for implementation of algorithms. But simulation in high dimensions is often very complex, cumbersome and slow. As a result, ...