Repeated observation of breast tumor subtypes in independent gene expression data sets
Characteristic patterns of gene expression measured by DNA microarrays have been used to classify tumors into clinically relevant subgroups. In this study, we have refined the previously defined…
  • 5,011
  • 306
Supervised risk predictor of breast cancer based on intrinsic subtypes.
PURPOSE: To improve on current standards for breast cancer prognosis and prediction of chemotherapy benefit by developing a risk model that incorporates the gene expression-based…
  • 3,117
  • 248
SiZer for Exploration of Structures in Curves
In the use of smoothing methods in data analysis, an important question is which observed features are “really there,” as opposed to being spurious sampling artifacts. An approach is…
  • 575
  • 106
An exact and easily computable expression for the mean integrated squared error (MISE) for the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order.
  • 742
  • 100
Comprehensive genomic characterization of head and neck squamous cell carcinomas
The Cancer Genome Atlas profiled 279 head and neck squamous cell carcinomas (HNSCCs) to provide a comprehensive landscape of somatic genomic alterations. Here we show that…
  • 2,006
  • 58
Sure independence screening for ultrahigh dimensional feature space Discussion
  • 153
  • 55
A Brief Survey of Bandwidth Selection for Density Estimation
There has been major progress in recent years in data-based bandwidth selection for kernel density estimation. Some “second generation” methods, including plug-in and smoothed bootstrap…
  • 1,189
  • 53
A method for normalizing histology slides for quantitative analysis
This paper focuses on the use of standard 24-bit RGB cameras to obtain images, so the methodology is restricted to those three wavelengths of light.
  • 470
  • 50
Geometric representation of high dimension, low sample size data
High dimension, low sample size data are emerging in various areas of science. We find a common structure underlying many such data sets by using a non-standard type of asymptotics: the dimension…
  • 439
  • 48
Predicting Fault Incidence Using Software Change History
This paper is an attempt to understand the processes by which software ages.
  • 745
  • 46