Learn More
Phylogenomic analyses of large sets of genes or proteins have the potential to revolutionize our understanding of the tree of life. However, problems arise because estimated phylogenies from individual loci often differ because of different histories, systematic bias, or stochastic error. We have developed Concaterpillar, a hierarchical clustering method(More)
Microsporidia branch at the base of eukaryotic phylogenies inferred from translation elongation factor 1alpha (EF-1alpha) sequences. Because these parasitic eukaryotes are fungi (or close relatives of fungi), it is widely accepted that fast-evolving microsporidian sequences are artifactually "attracted" to the long branch leading to the archaebacterial(More)
It has long been recognized that the rates of molecular evolution vary amongst sites in proteins. The usual model for rate heterogeneity assumes independent rate variation according to a rate distribution. In such models the rate at a site, although random, is assumed fixed throughout the evolutionary tree. Recent work by several groups has suggested that(More)
The covarion hypothesis of molecular evolution proposes that selective pressures on an amino acid or nucleotide site change through time, thus causing changes of evolutionary rate along the edges of a phylogenetic tree. Several kinds of Markov models for the covarion process have been proposed. One model, proposed by Huelsenbeck (2002), has 2 substitution(More)
BACKGROUND Widely used substitution models for proteins, such as the Jones-Taylor-Thornton (JTT) or Whelan and Goldman (WAG) models, are based on empirical amino acid interchange matrices estimated from databases of protein alignments that incorporate the average amino acid frequencies of the data set under examination (e.g JTT + F). Variation in the(More)
Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma.(More)
MOTIVATION Expressed sequence tag (EST) surveys are an efficient way to characterize large numbers of genes from an organism. The rate of gene discovery in an EST survey depends on the degree of redundancy of the cDNA libraries from which sequences are obtained. However, few statistical methods have been developed to assess and compare redundancies of(More)
MOTIVATION To understand the evolution of molecular function within protein families, it is important to identify those amino acid residues responsible for functional divergence; i.e. those sites in a protein family that affect cofactor, protein or substrate binding preferences; affinity; catalysis; flexibility; or folding. Type I functional divergence (FD)(More)
BACKGROUND Phylogenetic reconstruction methods based on gene content often place all the parasitic and endosymbiotic eubacteria (parasites for short) together in a clan. Many other lines of evidence point to this parasites clan being an artefact. This artefact could be a consequence of the methods used to construct ortholog databases (due to some unknown(More)
BACKGROUND The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like(More)