Frederick A. Matsen IV

BACKGROUND: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal.
Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis, we might obtain new insights into microbiology and also...
BACKGROUND: Bacterial vaginosis (BV) is a common condition that is associated with numerous adverse health outcomes and is characterized by poorly understood changes in the vaginal microbiota. We sought to describe the composition and diversity of the vaginal bacterial biota in women with BV using deep sequencing of the 16S rRNA gene coupled with...
It is now common to survey microbial communities by sequencing nucleic acid material extracted in bulk from a given environment. Comparative methods are needed that indicate the extent to which two communities differ given data sets of this type. UniFrac, which gives a somewhat ad hoc phylogenetics-based distance between two communities, is one of the most...
Heterochromatin is the gene-poor, satellite-rich eukaryotic genome compartment that supports many essential cellular processes. The functional diversity of proteins that bind and often epigenetically define heterochromatic DNA sequence reflects the diverse functions supported by this enigmatic genome compartment. Moreover, heterogeneous signatures of...
Simian foamy viruses (SFVs) are ubiquitous in non-human primates (NHPs). As in all retroviruses, reverse transcription of SFV leads to recombination and mutation. Because more humans have been shown to be infected with SFV than with any other simian-borne virus, SFV is a potentially powerful model for studying the virology and epidemiology of viruses at the...
Classifying individual bacterial species comprising complex, polymicrobial patient specimens remains a challenge for culture-based and molecular microbiology techniques in common clinical use. We therefore adapted practices from metagenomics research to rapidly catalog the bacterial composition of clinical specimens directly from patients, without the need for...
Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this...
Recent statistical and computational analyses have shown that a genealogical most recent common ancestor (MRCA) may have lived in the recent past [Chang, J.T., 1999. Recent common ancestors of all present-day individuals. Adv. Appl. Probab. 31, 1002-1026. 1027-1038; Rohde, D.L.T., Olson, S., Chang, J.T., 2004. Modelling the recent common ancestry of all...
It is well known among phylogeneticists that adding an extra taxon (e.g. species) to a data set can alter the structure of the optimal phylogenetic tree in surprising ways. However, little is known about this "rogue taxon" effect. In this paper we characterize the behavior of balanced minimum evolution (BME) phylogenetics on data sets of this type using...