GPU MrBayes V3.1: MrBayes on Graphics Processing Units for Protein Sequence Data.

Authors: Shuai Pang, Rebecca J. Stones, Mingming Ren, Xiao-guang Liu, Gang Wang, Hong-ju Xia, Hao-yang Wu, Yang Liu, Qiang Xie
Journal: Molecular Biology and Evolution, volume 32, issue 9
We present a modified GPU (graphics processing unit) version of MrBayes, called ta(MC)³ (GPU MrBayes V3.1), for Bayesian phylogenetic inference on protein data sets. Our main contributions are 1) utilizing 64-bit variables, thereby enabling ta(MC)³ to process larger data sets than MrBayes; and 2) using Kahan summation to improve accuracy, convergence rates, and consequently runtime. Versus the current fastest software, we achieve a speedup of up to around 2.5 (and up to around 90 vs… 
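To illustrate the Kahan summation mentioned in the abstract, here is a minimal sketch in Python. It is not the authors' GPU code: the function names `naive_sum` and `kahan_sum` and the test data are hypothetical, and it runs in double precision on the CPU. It only shows the general idea of carrying a running compensation term that recovers low-order bits lost when small values are added to a large total.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan (compensated) summation.

    `compensation` holds the rounding error of the previous addition
    so that it can be fed back into the next one.
    """
    total = 0.0
    compensation = 0.0
    for x in values:
        y = x - compensation            # apply the correction carried over
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


# Illustrative data (an assumption, not from the paper): one large term
# followed by many terms too small to change it when added one at a time.
values = [1.0] + [1e-16] * 100_000

print(naive_sum(values))  # the small terms are rounded away
print(kahan_sum(values))  # close to 1 + 1e-11
```

In this example the naive loop never moves away from 1.0 because each 1e-16 term is below half a unit in the last place of the running total, while the compensated loop accumulates them. Likelihood computations that sum many terms of very different magnitude can suffer from the same kind of loss.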

MrBayes for Phylogenetic Inference Using Protein Data on a GPU Cluster

A new task mapping strategy, the use of Kahan summation to resolve non-convergence issues, and the introduction of 64-bit variables are presented, which, for protein datasets, improve computational efficiency and overcome major obstacles in analyzing larger datasets on HPCs with multiple Graphics Processing Units (GPUs).

A Three-Level Parallel Algorithm for MrBayes 3.2

  • M. Zhao, Qiang Ren, X. Liu
  • Computer Science
    2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC)
  • 2017
A new three-level hybrid parallel algorithm that can be used on most modern multi-core computers, including data-level parallelism (DLP), thread-level parallelism (TLP), and process-level parallelism (PLP), which can be used on real-world protein data sets.

GPU Parallelism of Phylogenetic Likelihood Estimates for Protein Data

  • Yichan Li, Jingyang Gao, Cheng Ling, Haoyu Zhang
  • Computer Science
    2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
  • 2018
The proposed method, tgpMC3, achieves a peak speedup ratio of 117× using two NVIDIA Tesla K20 GPU cards on the Tianhe-1A supercomputer's GPU nodes, and outperforms another state-of-the-art GPU method for the analysis of protein sequences.

CuPhylo: A CUDA Based Application Program Interface and Library for Phylogenetic Analysis

  • Mingming Ren, Xiaomin Huang, Yuyang Gao, Gang Wang, Xiaoguang Liu
  • Biology
    2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)
  • 2019
CuPhylo is presented as a CUDA-based Application Program Interface (API) and library for likelihood computation in phylogenetic analysis; experimental results indicate that it outperforms another phylogenetic likelihood computation library, BEAGLE, on large-scale data sets.

Efficient ResNet Model to Predict Protein-Protein Interactions With GPU Computing

This paper proposes an efficient algorithm based on the residual network (ResNet) model to predict PPI (ResPPI), which uses the embedding method to represent amino acid sequences, combining the advantages of powerful feature extraction capabilities of the ResNet with deep layers and GPU performance.

CALANGO: an annotation-based, phylogeny-aware comparative genomics framework for exploring and interpreting complex genotypes and phenotypes

CALANGO (Comparative AnaLysis with ANnotation-based Genomic cOmponentes), a first-principles comparative genomics tool to search for annotation terms associated with a quantitative variable used to rank species data, demonstrates how GO-based annotation captures information of non-homologous sequences fulfilling the same biological roles.

Complete Mitogenomes of Three Carangidae (Perciformes) Fishes: Genome Description and Phylogenetic Considerations

The phylogenetic tree based on PCG sequences of mitogenomes, built using maximum likelihood and Bayesian inference analyses, showed three clades corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae.

Genetic and morphological analyses uncover a new record and a cryptic species in Allonais (Clitellata: Naididae)

It is suggested that the Chinese morphotype more resembles A. inaequalis from India, where the species was first described, whereas the species in Peru is likely to be another species.

Comparative mitogenome analyses uncover mitogenome features and phylogenetic implications of the subfamily Cobitinae

This study sequenced and analyzed the complete mitogenome of a female Cobitis macrostigma and conducted the first comparative mitogenomic and phylogenetic analyses within Cobitinae, providing new insights into the mitogenome features and evolution of fishes belonging to the subfamily Cobitinae.

CALANGO: a phylogeny-aware comparative genomics tool for discovering quantitative genotype-phenotype associations across species

CALANGO is presented, a phylogeny-aware comparative genomics tool that uncovers functional molecular convergences and homologous regions associated with quantitative genotypes and phenotypes across species, enabling the fast discovery of novel statistically sound, biologically relevant phenotype-genotype associations.



Efficient Implementation of MrBayes on Multi-GPU

A new “node-by-node” task scheduling strategy is developed to improve concurrency, and several optimizing methods are used to reduce extra overhead.

MrBayes on a Graphics Processing Unit

Improvements are implemented on the NVIDIA GeForce GTX 480 as most other GPUs are unsuitable for running MrBayes MC³ due to a range of reasons, such as having insufficient support for double precision floating-point arithmetic.

MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly, and provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates.

Arthropod relationships revealed by phylogenomic analysis of nuclear protein-coding sequences

This work presents strongly supported results from likelihood, Bayesian and parsimony analyses of over 41 kilobases of aligned DNA sequence from 62 single-copy nuclear protein-coding genes from 75 arthropod species, providing a statistically well-supported phylogenetic framework for the largest animal phylum.

MRBAYES: Bayesian inference of phylogenetic trees

The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo, and an executable is available at

On the phylogenetic position of Myzostomida: can 77 genes get it wrong?

It is concluded that reliance on a set of markers belonging to a single class of macromolecular complexes might bias the analysis, and that concatenation of all available data might introduce conflicting signal into phylogenetic analyses.

Phylogenetic position of Nemertea derived from phylogenomic data.

Nemertea and Platyhelminthes have traditionally been grouped together because they possess a so-called acoelomate organization, but the lateral vessels and rhynchocoel of nemerteans have been regarded as coelomic cavities; the acoelomate organization in Nemertea is most likely due to a secondary reduction of the coelom, as is found in certain species of Mollusca and Annelida.

A phylogenomic approach to resolve the basal pterygote divergence.

The comprehensive molecular data set developed here provides conclusive support for odonates as the most basal winged insect order (Chiastomyaria hypothesis) and data quality assessment indicates that proteins involved in cellular processes and signaling harbor the most informative phylogenetic signal.

Phylogenetic position of Sipuncula derived from multi‐gene and phylogenomic data and its implication for the evolution of segmentation

It is revealed that Sipuncula had secondarily lost segmentation, which does not support the hypothesis that the last common ancestor of Annelida, Arthropoda and Chordata was segmented, assuming several losses along the branches leading to them.

Phylogenomic analyses of lophophorates (brachiopods, phoronids and bryozoans) confirm the Lophotrochozoa concept

These analyses show that the three lophophorate lineages are affiliated with trochozoan rather than deuterostome phyla, and all hypotheses claiming that they are more closely related to Deuterostomia than to Protostomia can be rejected by topology testing.