Species Trees from Gene Trees Despite a High Rate of Lateral Genetic Transfer: A Tight Bound (Extended Abstract)

Constantinos Daskalakis and Sébastien Roch
Reconstructing the tree of life from molecular sequences is a fundamental problem in computational biology. Modern data sets often contain a large number of genes, which can complicate reconstruction because different genes may undergo different evolutionary histories. This is the case in particular in the presence of lateral genetic transfer (LGT), whereby a gene is inherited from a distant species rather than an immediate ancestor. Such an event produces a gene tree…

Species Trees are Recoverable from Unrooted Gene Tree Topologies Under a Constant Rate of Horizontal Gene Transfer

It is shown that a species phylogeny can be reconstructed correctly from gene trees even when, on each gene, each edge of the species tree has a constant probability of being the location of an HGT event.

Phylogeny of dependencies and dependencies of phylogenies in genes and genomes

It is shown that not taking into account these inter-dependency relationships (co-evolutionary relationships) during the inference of gene trees results in an overestimation of the differences between gene trees as well as between gene trees and the species tree.

Fast and Accurate Species Trees from Weighted Internode Distances

This study provides a new and very fast method for species tree estimation that improves upon ASTRID and has accuracy comparable to the state of the art while remaining much faster.

Inconsistency of Species Tree Methods under Gene Flow.

This work studies the performance of two of the most efficient coalescent-based methods, ASTRAL and NJst, in the presence of gene flow, and underlines the need for methods like PhyloNet to account simultaneously for ILS and gene flow in a unified framework.

FastMulRFS: Statistically consistent polynomial time species tree estimation under gene duplication

It is proved that FastMulRFS is polynomial time and statistically consistent under a generic model of gene duplication and loss provided that only duplications occur or only losses occur, and that it matches the accuracy of MulRF and has better accuracy than ASTRAL-multi.

Recent progress on methods for estimating and updating large phylogenies

New methods have been developed that aim to enable highly accurate phylogeny estimation on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, and methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes.

Polynomial-Time Statistical Estimation of Species Trees under Gene Duplication and Loss

It is shown that species trees are identifiable under a standard stochastic model for GDL, and that the polynomial-time algorithm ASTRAL-multi, a recent development in the ASTRAL suite of methods, is statistically consistent under this GDL model.

In the light of deep coalescence: revisiting trees within networks

It is shown that in the presence of coalescence effects, the set of displayed trees is not sufficient to capture the network and can form the basis for achieving higher accuracy when inferring phylogenetic networks.

Computational Phylogenetics: An Introduction to Designing Methods for Phylogeny Estimation

The author provides key analytical techniques to prove theoretical properties about methods, as well as addressing performance in practice for methods for estimating trees, in the broad and exciting field of computational phylogenetics.

Research in Computational Molecular Biology: 24th Annual International Conference, RECOMB 2020, Padua, Italy, May 10–13, 2020, Proceedings

An algorithm solving the genomic distance problem for natural genomes, in which any marker may occur an arbitrary number of times, is presented, based on a new graph data structure, the multi-relational diagram, that allows an elegant extension of the ILP to count runs of markers that are under- or over-represented in one genome with respect to the other and need to be inserted or deleted.



Recovering the Tree-Like Trend of Evolution Despite Extensive Lateral Genetic Transfer: A Probabilistic Analysis

Under a model of randomly distributed LGT, it is shown that the species phylogeny can be reconstructed even in the presence of surprisingly many (almost linear number of) LGT events per gene tree.

From Gene Trees to Species Trees

This paper studies various algorithmic issues in reconstructing a species tree from gene trees under the duplication and the mutation cost model and proposes a heuristic method that is significantly better than the existing program in Page's GeneTree 1.0 that starts the search from a random tree.

Gene Trees in Species Trees

When gene copies are sampled from various species, the gene tree relating these copies might disagree with the species phylogeny, and discord can arise from horizontal transfer, lineage sorting, and gene duplication and extinction.

Optimal phylogenetic reconstruction

The proof of Steel's conjecture is complete and a reconstruction algorithm using optimal (up to a multiplicative constant) sequence length is given to obtain an optimal reconstruction algorithm for the Jukes-Cantor model with short edges.

A model of horizontal gene transfer and the bacterial phylogeny problem.

A Markov model of genome evolution with HGT is introduced, accounting for the constraints on time (an HGT event can only occur between concomitantly living species), and this model is used to simulate multigene sequence data sets with or without HGT.

A Tree Obscured By Vines: Horizontal Gene Transfer and the Median Tree Method of Estimating Species Phylogeny

A new phylogeny estimation method is designed to estimate the species tree despite horizontal transfer, using the idea that horizontal transfer distorts distance relationships between pairs of species but a median estimate of the distances is robust to such distortions.

Parsimony Score of Phylogenetic Networks: Hardness Results and a Linear-Time Heuristic

A novel combinatorial definition of phylogenetic networks in terms of "forbidden cycles" is provided, and detailed hardness and hardness of approximation proofs for the "small" MP problem are provided.

Stochastic Models for Horizontal Gene Transfer

A simple class of stochastic models is proposed to examine HGT using multiple orthologous gene alignments, and the flexibility of these models is demonstrated to test competing ideas about HGT by examining the complexity hypothesis.

Do orthologous gene phylogenies really support tree-thinking?

It is concluded that phylogenetic analyses do not support tree-thinking, and it is argued that representations other than a tree should be investigated in this case because a non-critical concatenation of markers could be highly misleading.