Accurate prediction of orthologs in the presence of divergence after duplication

@article{Lafond2018AccuratePO,
  title={Accurate prediction of orthologs in the presence of divergence after duplication},
  author={Manuel Lafond and Mona Meghdari Miardan and David Sankoff},
  journal={Bioinformatics},
  year={2018},
  volume={34},
  pages={i366--i375}
}
Motivation: When gene duplication occurs, one of the copies may become free of selective pressure and evolve at an accelerated pace. This has important consequences on the prediction of orthology relationships, since two orthologous genes separated by divergence after duplication may differ in both sequence and function. In this work, we make the distinction between the primary orthologs, which have not been affected by accelerated mutation rates on their evolutionary path, and the secondary…

Performance of a phylogenetic independent contrast method and an improved pairwise comparison under different scenarios of trait evolution after speciation and duplication
TLDR
This work investigates under what reasonable evolutionary scenarios phylogenetic independent contrasts or pairwise comparisons can recover a putative signal of different functional evolution between orthologs and paralogs, and recommends methodological pluralism in studying gene family evolution.
Consistency of orthology and paralogy constraints in the presence of gene transfers
TLDR
It is shown that deciding if a relation graph $R$ is consistent with a given species network $N$ is NP-hard, and that it is W[1]-hard under the parameter "minimum number of transfers".
Phylostat: a web-based tool to analyze paralogous clade divergence in phylogenetic trees
TLDR
Phylostat is a web-based tool built on phylo.io that allows comparative clade divergence analysis; it is available at https://phylostat.adebalilab.org under an MIT open-source licence.
FastMulRFS: Statistically consistent polynomial time species tree estimation under gene duplication
TLDR
It is proved that FastMulRFS is polynomial time and statistically consistent under a generic model of gene duplication and loss provided that only duplications occur or only losses occur, and that it matches the accuracy of MulRF and has better accuracy than ASTRAL-multi.
OrthoFinder2: fast and accurate phylogenomic orthology analysis from gene sequences
TLDR
Ortholog inference has fundamental importance across the biological sciences, underpinning phylogenetics, comparative genomics and prediction of gene function, and OrthoFinder achieves higher ortholog recall than all current methods as assessed by community-standard benchmarks.
OrthoFinder: phylogenetic orthology inference for comparative genomics
TLDR
This extends OrthoFinder’s high accuracy orthogroup inference to provide phylogenetic inference of orthologs, rooted gene trees, gene duplication events, the rooted species tree, and comparative genomics statistics.
FastMulRFS: fast and accurate species tree estimation under generic gene duplication and loss models
TLDR
This work proves that FastMulRFS is statistically consistent under a generic model of GDL when adversarial GDL does not occur, and shows that it matches the accuracy of MulRF and has better accuracy than prior methods, including ASTRAL-multi.
Comparative study of the SBP-box gene family in rice siblings
TLDR
A comparative study of SBP-box genes in the genomes of rice and its nine siblings using a recently proposed hybrid method for orthology and paralogy detection (HyPPO) shows close correspondence in exon–intron structure and motif conservation.
CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets
TLDR
Although it is much faster than current methods, the results indicate that the approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs.
Evolutionary divergence of function and expression of laccase genes in plants
TLDR
Functional divergence analysis reveals that functional differentiation should occur among different groups of LACs because of altered selective constraints acting on some critical amino acid sites (CAASs) within conserved laccase domains during evolution.

References

Duplicated genes evolve slower than singletons despite the initial rate increase
TLDR
The evolutionary trajectory of duplicated genes appears to be determined by two opposing trends, namely, the post-duplication rate acceleration and the generally slow evolutionary rate owing to the high level of functional constraints.
Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs
TLDR
It is reported here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports the “ortholog conjecture”, and it is observed that orthologs have generally more similar functional annotations than paralogs.
Computational methods for Gene Orthology inference
TLDR
Comparisons of tree-based, sequence similarity-based, and synteny-based approaches, which can be combined into flexible hybrid methods, show that, despite conceptual differences, they produce similar sets of orthologs, especially at short evolutionary distances.
Integrating Sequence Evolution into Probabilistic Orthology Analysis.
TLDR
DLRSOrthology is proposed, a sound, comprehensive Bayesian Markov chain Monte Carlo-based method that efficiently sums over the possible gene trees and jointly takes into account the current gene tree, all possible reconciliations to the species tree, and the, typically strong, signal conveyed by the sequences.
Orthology prediction at scalable resolution by phylogenetic tree analysis
TLDR
A benchmark for orthology prediction that takes into account the varying levels of orthology between genes shows that the phylogeny-based high-resolution orthology assignments made by LOFT are reliable.
COCO-CL: hierarchical clustering of homology relations based on evolutionary correlations
TLDR
COCO-CL can be used as a semi-independent method to delineate the orthology/paralogy relation for a refined set of homologous proteins obtained using a less-conservative clustering approach, or as a refiner that removes putative out-paralogs from clusters computed using a more inclusive approach.
Orthologs, Paralogs, and Evolutionary Genomics
TLDR
This review examines in depth the definitions and subtypes of orthologs and paralogs, outlines the principal methodological approaches employed for identification of orthology and paralogy, and considers evolutionary and functional implications of these concepts.
Inferring orthology and paralogy.
TLDR
This chapter provides an overview of the methods used to infer orthology and paralogy, and surveys both graph-based approaches (and their various grouping strategies) and tree-based approaches, which solve the more general problem of gene/species tree reconciliation.
Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals
TLDR
It is concluded that the most important factor in the evolution of function is not amino acid sequence, but rather the cellular context in which proteins act, and the results shed light on the relationship between sequence divergence and functional divergence.