A general framework for estimating the relative pathogenicity of human genetic variants
Current methods for annotating and interpreting human genetic variation tend to exploit a single information type (for example, conservation) and/or are restricted in scope (for example, to missense…
  • 3,380
  • 437
An introduction to statistical learning
Statistics: An Introduction to Statistical Learning with Applications in R. An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential…
  • 4,041
  • 416
A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis.
We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix X as $\hat{X} = \sum_{k=1}^{K} d_k u_k v_k^T$, where…
  • 1,107
  • 152
The joint graphical lasso for inverse covariance estimation across multiple classes.
We consider the problem of estimating multiple related Gaussian graphical models from a high-dimensional data set with observations belonging to distinct classes. We propose the joint graphical…
  • 506
  • 105
A Framework for Feature Selection in Clustering
We consider the problem of clustering observations using a potentially large set of features. One might expect that the true underlying clusters present in the data differ only with respect to a…
  • 415
  • 69
Extensions of Sparse Canonical Correlation Analysis with Applications to Genomic Data
In recent work, several authors have introduced methods for sparse canonical correlation analysis (sparse CCA). Suppose that two sets of measurements are available on the same set of observations…
  • 345
  • 51
Penalized classification using Fisher's linear discriminant.
  • D. Witten, R. Tibshirani
  • Medicine, Mathematics
  • Journal of the Royal Statistical Society. Series…
  • 1 November 2011
We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is…
  • 287
  • 48
CADD: predicting the deleteriousness of variants throughout the human genome
Combined Annotation-Dependent Depletion (CADD) is a widely used measure of variant deleteriousness that can effectively prioritize causal variants in genetic analyses, particularly highly…
  • 480
  • 32
Hierarchical maintenance of MLL myeloid leukemia stem cells employs a transcriptional program shared with embryonic rather than adult stem cells.
The genetic programs that promote retention of self-renewing leukemia stem cells (LSCs) at the apex of cellular hierarchies in acute myeloid leukemia (AML) are not known. In a mouse model of human…
  • 290
  • 28
New Insights and Faster Computations for the Graphical Lasso
We consider the graphical lasso formulation for estimating a Gaussian graphical model in the high-dimensional setting. This approach entails estimating the inverse covariance matrix under a…
  • 244
  • 25