PARSIMONY JACKKNIFING OUTPERFORMS NEIGHBOR‐JOINING

@article{Farris1996PARSIMONYJO,
  title={PARSIMONY JACKKNIFING OUTPERFORMS NEIGHBOR‐JOINING},
  author={James S. Farris and Victor A. Albert and Mari K{\"a}llersj{\"o} and Diana L. Lipscomb and Arnold G. Kluge},
  journal={Cladistics},
  year={1996},
  volume={12}
}
Abstract: Because they are designed to produce just one tree, neighbor‐joining programs can obscure ambiguities in data. Ambiguities can be uncovered by resampling, but existing neighbor‐joining programs may give misleading bootstrap frequencies because they do not suppress zero‐length branches and/or are sensitive to the order of terminals in the data. A new procedure, parsimony jackknifing, overcomes these problems while running hundreds of times faster than existing programs for neighbor… 
The future of phylogeny reconstruction
TLDR
Parsimony jackknifing uses simple parsimony calculations combined with resampling of characters to arrive at a tree comprising well‐supported groups, allowing analysis of much larger data matrices, and also provides information on the strength of support for different groups.
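The resampling step described in this summary can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the function names, the toy data, and the retention probability of about 0.63 (that is, 1 - 1/e) are assumptions introduced here, and `toy_groups` is a hypothetical stand-in for a real parsimony calculation.

```python
import random
from collections import Counter


def jackknife_columns(matrix, keep_prob, rng):
    """Build one pseudoreplicate by keeping each character (column)
    independently with probability keep_prob."""
    n_chars = len(next(iter(matrix.values())))
    kept = [i for i in range(n_chars) if rng.random() < keep_prob]
    return {taxon: [states[i] for i in kept] for taxon, states in matrix.items()}


def toy_groups(matrix):
    """Hypothetical stand-in for a parsimony calculation: every binary
    character proposes the taxa scored '1' as a group. A real analysis
    would build a tree from the resampled matrix instead."""
    taxa = sorted(matrix)
    n_chars = len(matrix[taxa[0]]) if taxa else 0
    groups = set()
    for i in range(n_chars):
        members = frozenset(t for t in taxa if matrix[t][i] == "1")
        if 1 < len(members) < len(taxa):  # skip trivial groups
            groups.add(members)
    return groups


def group_frequencies(matrix, find_groups, n_reps=1000, keep_prob=0.63, seed=0):
    """Fraction of pseudoreplicates in which each group is found.
    keep_prob=0.63 is an assumed default, not a value given in the text."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reps):
        counts.update(find_groups(jackknife_columns(matrix, keep_prob, rng)))
    return {group: n / n_reps for group, n in counts.items()}


# Toy matrix: two characters support A+B, one supports C+D.
matrix = {"A": "1100", "B": "1100", "C": "0011", "D": "0010"}
freqs = group_frequencies(matrix, toy_groups)
```

In this toy case the group A+B, backed by two characters, should be recovered in more pseudoreplicates than C+D, which rests on a single character. Groups whose frequency falls below a chosen cutoff would then be left out of the summary tree.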
Branch Lengths Do Not Indicate Support—Even in Maximum Likelihood
It is still common to see the branch lengths of “phylograms” interpreted as indicating support for groups. This is unfortunate, for it is easy to find cases in which long branches do not indicate
Parsimony analysis of phylogenomic datasets (II): evaluation of PAUP*, MEGA and MPBoot
TLDR
This paper examines the implementation of parsimony methods in the programs PAUP*, MEGA and MPBoot, and compares them with TNT, and finds that bootstrapping with PAUP*, MEGA or MPBoot can attribute strong support to groups that have no support at all under any meaningful concept of support, such as likelihood ratios or Bremer supports.
Analyzing Large Data Sets in Reasonable Times: Solutions for Composite Optima
TLDR
New methods for parsimony analysis of large data sets are presented, including sectorial searches, tree‐drifting, and tree‐fusing which find a shortest tree in less than 10 min and perform well in other cases analyzed.
Simple phylogenetic tree searches easily "succeed" with large matrices of single genes
TLDR
It is shown with both extensive real and simulated data that rigorous and time-intensive approaches to reconstructing large phylogenetic trees are unwarranted with small amounts of data because they actually produce trees with scores that are shorter or otherwise less optimal than the model tree or trees produced with larger amounts of data.
Misleading results of likelihood‐based phylogenetic analyses in the presence of missing data
TLDR
This study uses contrived and simulated examples to demonstrate that likelihood, even when applied to simple matrices with little or no homoplasy, homogeneous evolution across groups of characters, perfect model fit, and hundreds or thousands of variable characters, can provide strong support for incorrect topologies when the matrices have non‐random distributions of missing data distributed across all partitions.
Why Neighbor-Joining Works
Abstract We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson’s optimal radius
Branch support via resampling: an empirical study
TLDR
Two datasets were explored for a range of search parameters using jackknifing, and strict consensus summary of resampling replicates is preferable to frequency‐within‐replicates summary because it is a more conservative approach to the reporting of replicate results.
Support Weighting
Previous weighting methods—including compatibility weighting—have assumed that homoplasy indicates unreliability, but this assumption does not seem to hold for large molecular data matrices.
Computer science and parsimony: a reappraisal, with discussion of methods for poorly structured datasets
  • P. Goloboff
  • Biology
    Cladistics : the international journal of the Willi Hennig Society
  • 2015
TLDR
This contribution discusses new heuristic methods for parsimony analysis, including methods highly praised by their authors, such as Hydra, Sampars and GA + PR + LS.

References

SKEWNESS AND PERMUTATION
TLDR
The skewness criterion of phylogenetic structure in data is too sensitive to character state frequencies, is not sensitive enough to number of characters, and relies on counts of arbitrarily‐resolved bifurcating trees to give misleading results.
ON MISSING ENTRIES IN CLADISTIC ANALYSIS
TLDR
The exact algorithms of two commonly used parsimony programs, Hennig86 and PAUP, sometimes produce different solutions, and sometimes produce resolutions that are not supported by the data being analysed, causing discrepancies in the treatment of missing entries.
Multiple UPGMA and Neighbor-joining Trees and the Performance of Some Computer Packages
TLDR
It is shown that multiple UPGMA and NJ trees cannot be neglected with molecular data based on allozyme distances or “binary” distances derived from random amplified polymorphic DNA, restriction fragments, DNA fingerprints, or general protein patterns, and observed that NTSYS, PHYLIP, MVSP and MVSP87 have different efficiencies in finding ties.
CONFIDENCE LIMITS ON PHYLOGENIES: AN APPROACH USING THE BOOTSTRAP
  • J. Felsenstein
  • Economics
    Evolution; international journal of organic evolution
  • 1985
TLDR
The recently‐developed statistical method known as the “bootstrap” can be used to place confidence intervals on phylogenies and shows significant evidence for a group if it is defined by three or more characters.
The neighbor-joining method: a new method for reconstructing phylogenetic trees.
TLDR
The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Estimating Phylogenetic Trees from Distance Matrices
TLDR
The distance Wagner procedure is applicable to data matrices of immunological distance, such as that of Sarich (1969a), in which between-OTU comparisons are evaluated but for which no attributes of the OTUs themselves are directly observable.
Relative efficiencies of the maximum parsimony and distance-matrix methods in obtaining the correct phylogenetic tree.
TLDR
The relative efficiencies of the maximum parsimony (MP) and distance-matrix methods in obtaining the correct tree (topology) were studied by using computer simulation, indicating that when the number of nucleotide substitutions per site is small and a relatively small number of nucleotides are used, the probability of obtaining the correct topology (P1) is generally lower in the MP method than in the distance-matrix methods.
Relative Efficiencies of the Fitch-Margoliash, Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree
TLDR
The relative efficiencies of several tree-making methods for obtaining the correct phylogenetic tree were studied by using computer simulation, and the NJ method seems to be a method of choice.
UNINFORMATIVE BOOTSTRAPPING
The effect of uninformative characters on the “significance levels” obtained by bootstrapping in cladistic analysis is investigated empirically. Twenty‐eight data sets from Platnick's benchmarks are
The effect of irrelevant characters on bootstrap values
conveniently its inverse, relevance, can be defined recursively, at least for binary characters. A binary character is relevant to a node if any of these conditions