We describe factored spectrally transformed linear mixed models (FaST-LMM), an algorithm for genome-wide association studies (GWAS) that scales linearly with cohort size in both run time and memory use. On Wellcome Trust data for 15,000 individuals, FaST-LMM ran an order of magnitude faster than current efficient algorithms. Our algorithm can analyze data…
In epigenome-wide association studies, cell-type composition often differs between cases and controls, yielding associations that simply tag cell type rather than reveal fundamental biology. Current solutions require actual or estimated cell-type composition, information that is not easily obtainable for many samples of interest. We propose a method,…
Promiscuous binding of T helper epitopes to MHC class II molecules has been well established, but few examples of promiscuous class I-restricted epitopes exist. To address the extent of promiscuity of HLA class I peptides, responses to 242 well-defined viral epitopes were tested in 100 subjects regardless of the individuals' HLA type. Surprisingly, half of…
Hereditary predisposition and causative environmental exposures have long been recognized in human malignancies. In most instances, cancer cases occur sporadically, suggesting that environmental influences are critical in determining cancer risk. To test the influence of genetic polymorphisms on breast cancer risk, we have measured 98 single nucleotide…
CRISPR-Cas9-based genetic screens are a powerful new tool in biology. By simply altering the sequence of the single-guide RNA (sgRNA), one can reprogram Cas9 to target different sites in the genome with relative ease, but the on-target activity and off-target effects of individual sgRNAs can vary widely. Here, we use recently devised sgRNA design rules to…
Multiple realizations of continuous-valued time series from a stochastic process often contain systematic variations in rate and amplitude. To leverage the information contained in such noisy replicate sets, we need to align them in an appropriate way (for example, to allow the data to be properly combined by adaptive averaging). We present the Continuous…
Supplementary Figure 1: A comparison of P values for the algorithms described in Table 1. Supplementary Table 1: SNPs found to be significant by at least one of the algorithms in Table 1. Supplementary Methods: Supplementary methods. Supplementary Note 1: Experiments with synthetic data.
Understanding the genetic underpinnings of disease is important for screening, treatment, drug development, and basic biological insight. One way to gain such an understanding is to find out which parts of our DNA, such as single-nucleotide polymorphisms, affect particular intermediary processes such as gene expression. Naively, such associations can…
MOTIVATION: Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants and reduce the burden of multiple hypothesis testing. Until now, these…
MOTIVATION: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test, a score test, with a recently…