GAPIT Version 2: An Enhanced Integrated Tool for Genomic Association and Prediction

Authors: You Tang, Xiaolei Liu, Jiabo Wang, Meng Li, Qishan Wang, Feng Tian, Zhongbin Su, Yu-chun Pan, Di Liu, Alexander E. Lipka, Edward S. Buckler, and Zhiwu Zhang
Journal: The Plant Genome
Most human diseases and agriculturally important traits are complex. [...] These methods include factored spectrally transformed linear mixed models (FaST-LMM), enriched CMLM (ECMLM), FaST-LMM-Select, and settlement of mixed linear models under progressively exclusive relationship (SUPER). The genomic prediction methods implemented in this new release of GAPIT include gBLUP based on CMLM, ECMLM, and SUPER.

Method for Genome-Wide Association Study: A Soybean Example.

An example of downloading and filtering SNP data, followed by GWAS analysis using the R package rMVP, is provided.

Status and prospects of genome‐wide association studies in plants

This work states that the development of the mixed model framework for GWAS dramatically reduced the number of false positives compared with naïve methods, and that many methods have since been developed to increase computational speed or improve statistical power in GWAS.

iPat: intelligent prediction and association tool for genomic research

A GWAS-assisted genomic prediction method was implemented to perform genomic prediction using any GWAS method such as FarmCPU, and a user-friendly graphical user interface was developed, named the Intelligent Prediction and Association Tool (iPat).

SNPViz v2.0: A web-based tool for enhanced haplotype analysis using large scale resequencing datasets and discovery of phenotypes causative gene using allelic variations

SNPViz v2.0 is a web-based tool to visualize large-scale haplotype blocks with detailed SNPs and indels grouped by their chromosomal coordinates, along with their overlapping gene models, phenotype-to-genotype accuracies, Gene Ontology (GO) annotations, protein family (Pfam) annotations, genomic variant annotations, and their functional effects.

From Hype to Hope: Genome-Wide Association Studies in Soybean

Major progress in understanding population structure and advancements in the design and implementation of association mapping are described, examples of association mapping in soybean are summarized, and major opportunities with potential implications for soybean are discussed.

Genetic architecture and genomic prediction accuracy of apple quantitative traits across environments

This most comprehensive genomic study in apple in terms of trait-environment combinations provided knowledge of trait biology and prediction models that can be readily applied for marker-assisted or genomic selection, thus facilitating increased breeding efficiency.

Dissecting Complex Traits Using Omics Data: A Review on the Linear Mixed Models and Their Application in GWAS

This review covers the LMM-based methods available in the literature for GWAS and discusses the advantages and weaknesses of LMMs in GWAS.



GAPIT: genome association and prediction integrated tool

An R package called GAPIT is developed that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection and can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time.

Towards sequence-based genomic selection of cattle

This work reports the first results, based on analysis of the first 234 bovine whole-genome sequences, of an international effort to resequence the genomes of a large number of key ancestor bulls of the most important domestic cattle breeds.

TASSEL: software for association mapping of complex traits in diverse samples

TASSEL (Trait Analysis by aSSociation, Evolution and Linkage) implements general linear model and mixed linear model approaches for controlling population and family structure and allows for linkage disequilibrium statistics to be calculated and visualized graphically.

An efficient multi-locus mixed model approach for genome-wide association studies in structured populations

Simulations suggest that the proposed multi-locus mixed model as a general method for mapping complex traits in structured populations outperforms existing methods in terms of power as well as false discovery rate.

A mixed-model approach for genome-wide association studies of correlated traits in structured populations

This work extends the linear mixed-model approach to carry out GWAS of correlated phenotypes, deriving a fully parameterized multi-trait mixed model (MTMM) that considers both the within-trait and between-trait variance components simultaneously for multiple traits.

Prioritizing GWAS results: A review of statistical methods and recommendations for their application.

Using Whole-Genome Sequence Data to Predict Quantitative Trait Phenotypes in Drosophila melanogaster

It is hypothesized that predictive power in this population stems from the SNP-based modeling of the subtle relationship structure caused by long-range linkage disequilibrium and not from population structure or SNPs in linkage disequilibrium with causal variants.

Mixed linear model approach adapted for genome-wide association studies

A compression approach is reported, called 'compressed MLM', that decreases the effective sample size of such datasets by clustering individuals into groups and a complementary approach, 'population parameters previously determined' (P3D), that eliminates the need to re-compute variance components.
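
The compression idea summarized above can be illustrated with a minimal sketch. It assumes the individuals have already been assigned to groups by some clustering step, and it assumes the group-level kinship is taken as the mean of the individual-level kinship values between group members; the function name `compress_kinship` and the toy matrix are hypothetical, and this is not the implementation used in GAPIT or TASSEL.

```python
from collections import defaultdict


def compress_kinship(kinship, groups):
    """Average an individual-level kinship matrix into a group-level one.

    kinship: square list of lists (individual x individual).
    groups:  one group label per individual (assumed to come from a
             prior clustering step, which is not shown here).
    Returns the sorted group labels and the group x group matrix.
    """
    # Collect the row/column indices that belong to each group.
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)

    labels = sorted(members)
    compressed = []
    for a in labels:
        row = []
        for b in labels:
            # Mean kinship over all (member of a, member of b) pairs.
            # Using the mean is an assumption made for this sketch.
            total = sum(kinship[i][j] for i in members[a] for j in members[b])
            row.append(total / (len(members[a]) * len(members[b])))
        compressed.append(row)
    return labels, compressed


# Toy example: four individuals compressed into two groups.
kinship = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.3, 0.0],
    [0.1, 0.3, 1.0, 0.6],
    [0.2, 0.0, 0.6, 1.0],
]
groups = [0, 0, 1, 1]
labels, compressed = compress_kinship(kinship, groups)
print(labels, compressed)
```

The random-effect dimension drops from the number of individuals to the number of groups, which is where the reduction in effective sample size comes from.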

Variance component model to account for sample structure in genome-wide association studies

This work reports a variance component approach, implemented in the publicly available software EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours.

PLINK: a tool set for whole-genome association and population-based linkage analyses.

This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.