Neighbor Joining Algorithms for Inferring Phylogenies via LCA Distances

  • Ilan Gronau, Shlomo Moran
  • Published 24 March 2007
  • Computer Science
  • Journal of computational biology : a journal of computational molecular cell biology, volume 14, issue 1
Reconstructing phylogenetic trees efficiently and accurately from distance estimates is an ongoing challenge in computational biology from both practical and theoretical considerations. We study algorithms which are based on a characterization of edge-weighted trees by distances to LCAs (Least Common Ancestors). This characterization enables a direct application of ultrametric reconstruction techniques to trees which are not necessarily ultrametric. A simple and natural neighbor joining… 
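The LCA-distance idea mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It relies only on the standard identity that, in an edge-weighted tree rooted at a taxon r, the distance from r to the LCA of i and j equals (d(r,i) + d(r,j) - d(i,j)) / 2. The function names `lca_depths` and `deepest_pair` and the four-taxon matrix `D` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def lca_depths(dist, root):
    """Distance from `root` to LCA(i, j) for every pair, from pairwise distances.

    Uses the identity depth(LCA(i, j)) = (d(r, i) + d(r, j) - d(i, j)) / 2,
    which holds exactly when `dist` is additive (realized by a tree).
    """
    n = len(dist)
    return [
        [(dist[root][i] + dist[root][j] - dist[i][j]) / 2 for j in range(n)]
        for i in range(n)
    ]


def deepest_pair(depths, root):
    """Pair of non-root taxa whose LCA lies farthest from the root."""
    taxa = [i for i in range(len(depths)) if i != root]
    return max(combinations(taxa, 2), key=lambda p: depths[p[0]][p[1]])


# Hypothetical additive distances. Taxon 0 is the root; it attaches to an
# internal node u (edge 1). u has children taxon 1 (edge 2) and an internal
# node v (edge 1). v has children taxon 2 (edge 3) and taxon 3 (edge 1).
D = [
    [0, 3, 5, 3],
    [3, 0, 6, 4],
    [5, 6, 0, 4],
    [3, 4, 4, 0],
]

L = lca_depths(D, root=0)
pair = deepest_pair(L, root=0)
```

In this example taxa 2 and 3 have the deepest LCA (node v, at depth 2), so an ultrametric-style agglomeration would join them first, even though the tree itself is not ultrametric. With estimated rather than exact distances the identity holds only approximately.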

Fast Algorithms for Large-Scale Phylogenetic Reconstruction

Three novel fast phylogenetic algorithms are developed and LSHTree, the first sub-quadratic time algorithm with theoretical performance guarantees under a Markov model of sequence evolution, is applied to the problem of placing large numbers of short sequence reads onto a fixed phylogenetic tree.

Distance-Based Phylogeny Reconstruction: Safety and Edge Radius

Two new distance based methods for phylogenetic tree reconstruction

  • Yong-Jun Ma, Zuguo Yu
  • Biology
    2011 4th International Conference on Biomedical Engineering and Informatics (BMEI)
  • 2011
The results show that the DS method and mDLCA method perform as well as the NJ method, and are better than the UPGMA and DLCA methods.

Fast and reliable reconstruction of phylogenetic trees with very short edges

This paper presents a fast converging reconstruction algorithm which returns a partially resolved topology containing all edges of the original tree whose weight exceeds some (non-trivial) lower bound, which is determined by the input sequence length, as well as some properties of the tree, such as its depth.

Computational Problems in Evolution Multiple Alignment

A new distance-based tree reconstruction method with optimal reconstruction radius and optimal runtime complexity, and an algorithm for computing the number of mutational events between aligned DNA sequences which is several hundred times faster than the famous Phylip packages.

Towards optimal distance functions for stochastic substitution models.

Identifying and Reconstructing Lateral Transfers from Distance Matrices by Combining the Minimum Contradiction Method and Neighbor-Net

A new approach is presented to deal with lateral gene transfers that combines the Neighbor-Net algorithm for computing phylogenetic networks with the Minimum Contradiction method, and is illustrated by applying it to a distance matrix for Archaea, Bacteria, and Eukaryota.

On the hardness of inferring phylogenies from triplet-dissimilarities

Fixed-parameter algorithms for some combinatorial problems in bioinformatics

This thesis applies fixed-parameter algorithms to cope with three NP-hard problems in bioinformatics: Flip Consensus Tree Problem, Bond Order Assignment Problem and Weighted Cluster Editing Problem, which arises in computational biology when clustering objects with respect to a given similarity or distance measure.

Inference for Large Tree-structured Data

A novel approach with tree-based representations of magnetic resonance images is used wherein the developed tests are employed to ascertain tumour heterogeneity between two groups of patients.



Fast neighbor joining

Faster reliable phylogenetic analysis

Fast new algorithms for phylogenetic reconstruction from distance data or weighted quartets are presented, and an attractive duality between unrooted trees, splits, and dissimilarities on one hand, and rooted trees, clusters, and similarity measures on the other is introduced.

Improvement of distance-based phylogenetic methods by a local maximum likelihood approach using triplets.

A new approach to estimate the evolutionary distance between two sequences using a tree with three leaves, which improves the precision of evolutionary distance estimates, and thus the topological accuracy of distance-based methods.

Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle

A greedy approach to minimum evolution which produces a starting topology in O(n^2) time and yields a very significant improvement over NJ and other distance-based algorithms, especially with large trees, in terms of topological accuracy.

Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting.

It is proved that the BME principle is a special case of the weighted least-squares approach, with biologically meaningful variances of the distance estimates, and it is demonstrated that FASTME only produces trees with positive branch lengths, a feature that separates this approach from NJ (and related methods) that may produce trees with branches with biologically meaningless negative lengths.

Shortest triplet clustering: reconstructing large phylogenies using representative sets

This work proposes a new distance-based clustering method, the shortest triplet clustering algorithm (STC), to reconstruct phylogenies and introduces a natural definition of so-called k-representative sets, which serve as building blocks for the STC algorithm to agglomerate sequences for tree reconstruction in O(n^2) time for n sequences.

NJML: a hybrid algorithm for the neighbor-joining and maximum-likelihood methods.

  • S. Ota, W. Li
  • Biology
    Molecular biology and evolution
  • 2000
A "divide-and-conquer" heuristic algorithm in which an initial neighbor-joining (NJ) tree is divided into subtrees at internal branches having bootstrap values higher than a threshold, which is suitable for reconstructing relatively large molecular phylogenetic trees.

Beyond pairwise distances: neighbor-joining with phylogenetic diversity estimates.

A generalization of the neighbor-joining transformation is presented, which uses estimates of phylogenetic diversity rather than pairwise distances in the tree, which leads to an improved neighbor-joining algorithm whose total running time is still polynomial in the number of taxa.

Maximal Accurate Forests from Distance Matrices

This work presents a fast converging method for distance-based phylogenetic inference, which is novel in two respects: first, it is the only method to guarantee accuracy when knowledge about the model tree, i.e., bounds on the edge lengths, is not assumed; and second, with high probability, no false assertions are made.

A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates.

Parsimony and compatibility had particular difficulty with inaccuracy and bias when substitution rates varied among different branches, and maximum likelihood was the most successful method overall, although for short sequences Fitch-Margoliash and neighbor joining were sometimes better.