Previous analyses of relations, divergence times, and diversification patterns among extant mammalian families have relied on supertree methods and local molecular clocks. We constructed a molecular supermatrix for mammalian families and analyzed these data with likelihood-based methods and relaxed molecular clocks. Phylogenetic analyses resulted in a…
Phylogenetic trees are commonly reconstructed based on hard optimization problems such as maximum parsimony (MP) and maximum likelihood (ML). Conventional MP heuristics yield good solutions within reasonable time on small datasets (up to a few thousand sequences), while ML heuristics are limited to smaller datasets (up to…
BACKGROUND: MapReduce is a parallel framework that has been used effectively to design large-scale parallel applications for large computing clusters. In this paper, we evaluate the viability of the MapReduce framework for designing phylogenetic applications. The problem of interest is generating the all-to-all Robinson-Foulds distance matrix, which has many…
BACKGROUND: Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores…
Large and comprehensive phylogenetic trees are desirable for studying macroevolutionary processes and for classification purposes. One approach for obtaining large phylogenies is to combine the topologies (or source trees) from previous phylogenetic studies. Tree reconstruction techniques that use this methodology are known as supertree methods. In…
In this paper, we study two fast algorithms, HashRF and PGM-Hashed, for computing the Robinson-Foulds (RF) distance matrix between a collection of evolutionary trees. The RF distance matrix represents a tremendous data-mining opportunity for helping biologists understand the evolutionary relationships depicted among their trees. The novelty of our work…
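To make the RF distance matrix concrete, here is a minimal Python sketch of the naive pairwise computation. It is not HashRF or PGM-Hashed, whose details are not given in the abstract. It assumes, purely for illustration, that each tree is a nested tuple of taxon names over the same taxon set, and that the RF distance is the unnormalized count of nontrivial bipartitions found in exactly one of the two trees. The function names `leaf_sets`, `bipartitions`, and `rf_matrix` are hypothetical.

```python
from itertools import combinations


def leaf_sets(tree):
    """Return (all leaves, leaf set under every internal node) of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = leaf_sets(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def bipartitions(tree):
    """Nontrivial bipartitions of the tree viewed as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor taxon, so the same split always gets the same representation.
    """
    all_leaves, found = leaf_sets(tree)
    anchor = min(all_leaves)
    splits = set()
    for side in found:
        if anchor in side:
            side = all_leaves - side
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


def rf_matrix(trees):
    """All-to-all RF distances: size of the symmetric difference of split sets."""
    splits = [bipartitions(t) for t in trees]
    n = len(trees)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = len(splits[i] ^ splits[j])
    return matrix


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = ("A", ("B", ("C", "D")))  # same unrooted topology as t1
print(rf_matrix([t1, t2, t3]))
```

This pairwise approach compares every pair of split sets directly, so its cost grows quadratically with the number of trees. That growth is the kind of cost that motivates faster, hash-based methods for large collections.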
Many large-scale phylogenetic reconstruction methods attempt to solve hard optimization problems (such as Maximum Parsimony (MP) and Maximum Likelihood (ML)), but they are limited severely by the number of taxa that they can handle in a reasonable time frame. A standard heuristic approach to this problem is the divide-and-conquer strategy: decompose the…
Consensus trees are a popular approach for summarizing the shared evolutionary relationships in a collection of trees. Many popular techniques such as Bayesian analyses produce results that can contain tens of thousands of trees to summarize. We develop a fast consensus algorithm called HashCS to construct large-scale consensus trees. We perform an…
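As a rough illustration of what a consensus summary contains, the Python sketch below collects the clusters (leaf sets under internal nodes) that occur in more than half of the input trees, which is the majority-rule criterion. It is not the HashCS algorithm, which the abstract does not describe. It assumes rooted trees written as nested tuples of taxon names, and the names `clusters` and `majority_clusters` are hypothetical.

```python
from collections import Counter


def clusters(tree):
    """Collect the leaf set under every internal node of a rooted nested-tuple tree."""
    found = set()

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def majority_clusters(trees):
    """Clusters present in strictly more than half of the trees."""
    counts = Counter(c for t in trees for c in clusters(t))
    return {c for c, k in counts.items() if k > len(trees) / 2}


trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
for cluster in sorted(majority_clusters(trees), key=lambda c: (len(c), sorted(c))):
    print(sorted(cluster))
```

The majority clusters are pairwise compatible, so they can be assembled into a single tree. That assembly step is omitted here.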
Trends in parallel computing indicate that heterogeneous parallel computing will be one of the most widespread platforms for computation-intensive applications. A heterogeneous computing environment offers considerably more computational power at a lower cost than a parallel computer. We propose the Heterogeneous Bulk Synchronous Parallel (HBSP) model, which…