Learn More
The purpose of this study was to classify breast carcinomas based on variations in gene expression patterns derived from cDNA microarrays and to correlate tumor characteristics to clinical outcome. A total of 85 cDNA microarray experiments representing 78 cancers, three fibroadenomas, and four normal breast tissues were analyzed by hierarchical clustering.(More)
Characteristic patterns of gene expression measured by DNA microarrays have been used to classify tumors into clinically relevant subgroups. In this study, we have refined the previously defined subtypes of breast tumors that could be distinguished by their distinct patterns of gene expression. A total of 115 malignant breast tumors were analyzed by(More)
In the words of the authors, the goal of this book was to " bring together many of the important new ideas in learning, and explain them in a statistical framework. " The authors have been quite successful in achieving this objective and their work will be a welcome addition to the statistics and learning literatures. Statistics has always been an(More)
BACKGROUND Low-molecular-weight heparins may simplify the management of deep venous thrombosis. A critical clinical issue is whether this more convenient therapy is as safe and effective as treatment with unfractionated heparin. PURPOSE To compare the safety and efficacy of low-molecular-weight heparins with those of unfractionated heparin for treatment(More)
We have devised an approach to cancer class prediction from gene expression profiling, based on an enhancement of the simple nearest prototype (centroid) classifier. We shrink the prototypes and hence obtain a classifier that is often more accurate than competing methods. Our method of "nearest shrunken centroids" identifies subsets of genes that best(More)
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include ℓ(1) (the lasso), ℓ(2) (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent,(More)
MOTIVATION Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K-means clustering are not robust to missing data, and may lose(More)
The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a(More)