Comparison of collapsing methods for the statistical analysis of rare variants

Carmen Dering, Andreas Ziegler, Inke Regina König, Claudia Hemmelmann
BMC Proceedings, pages S115 - S115
Novel technologies allow sequencing of whole genomes and are considered an emerging approach for identifying rare disease-associated variants. Recent studies have shown that multiple rare variants can explain a particular proportion of the genetic basis for disease. Following this assumption, we compare five collapsing approaches to test for groupwise association with disease status, using simulated data provided by Genetic Analysis Workshop 17 (GAW17). Variants are collapsed in…
Identification of genetic association of multiple rare variants using collapsing methods
Overall, collapsing rare variants can increase the power of identifying disease‐associated genes; however, studying genetic associations of rare variants remains a challenging task that requires further development and improvement in data collection, management, analysis, and computation.
Regularized Rare Variant Enrichment Analysis for Case‐Control Exome Sequencing Data
This work proposes the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data and simultaneously evaluates the effects of rare variants in multiple genes, focusing on gene‐based least absolute shrinkage and selection operator (LASSO) and exon‐based sparse group LASSO models.
Do rare variant genotypes predict common variant genotypes?
Nominal evidence of correlation between rare and common variants was found in 21-30% of cases examined for unrelated individuals; this rate increased to 38-44% for related individuals, underscoring the segregation that underlies synthetic association.
Joint analyses of disease and correlated quantitative phenotypes using next‐generation sequencing data
Group 10 addressed the challenges and potential uses of next‐generation sequencing data to identify causal variants through a broad range of statistical methods and demonstrated in certain cases that performing a joint analysis of disease status and a quantitative trait can improve statistical power.
The Empirical Hierarchical Bayes Approach for Pathway Integration and Gene-Environment Interactions in Genome-Wide Association Studies
This thesis uses an empirical hierarchical Bayes model proposed for the integration of external information into genome-wide association studies to incorporate biological pathway information and provide a new test for gene environment interaction (GxE) by adapting the method for that purpose.
DNA repair in bladder cancer predisposition and radiotherapy treatment response
The contribution of DNA repair gene variants in bladder cancer risk and predicting radiotherapy response is demonstrated and could contribute to the goal of personalised medicine for targeted primary prevention, early diagnosis and treatment individualisation.


Statistical analysis of rare sequence variants: an overview of collapsing methods
This work provides an overview of these collapsing methods for association analysis and discusses the use of permutation approaches for significance testing of the data‐adaptive methods.
Evaluating methods for the analysis of rare variants in sequence data
Overall, all analyzed methods are found to have serious practical limitations in identifying causal genes, and gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.
A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic
It is demonstrated that resequencing studies can identify important genetic associations, provided that specialised analysis methods, such as the weighted-sum method, are used.
Pooled association tests for rare variants in exon-resequencing studies.
An Evaluation of Statistical Approaches to Rare Variant Analysis in Genetic Association Studies
The results demonstrate that methods based on accumulations of rare variants discovered through re‐sequencing offer substantially greater power than conventional analysis of GWA data, and thus provide an exciting opportunity for future discovery of genetic determinants of complex traits.
Genetic Analysis Workshop 17 mini-exome simulation
The data set simulated for Genetic Analysis Workshop 17 was designed to mimic a subset of data that might be produced in a full exome screen for a complex disorder and related risk factors in order
PLINK: a tool set for whole-genome association and population-based linkage analyses.
This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, with a focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
1000 Genomes: a deep catalog of human genetic variation