On the independent loci assumption in phylogenomic studies


Studies using multi-locus coalescent methods to infer species trees or historical demographic parameters usually require the assumption that the gene tree for each locus (or SNP) is genealogically independent from the gene trees of other sampled loci. In practice, however, researchers have used two different criteria to delimit independent loci in phylogenomic studies. The first criterion, which directly addresses the condition of genealogical independence of sampled loci, considers the long-term effects of homologous recombination and effective population size on linkage between two loci. In contrast, the second criterion, which only considers the single-generation effects of recombination in the meioses of individuals, identifies sampled loci as being independent of each other if they undergo Mendelian independent assortment. Methods that use these criteria to estimate the number of independent loci per genome as well as intra-chromosomal “distance thresholds” that can be used to delimit independent loci in phylogenomic datasets are reviewed. To compare the efficacy of each criterion, they are applied to two species (an invertebrate and vertebrate) for which relevant genetic and genomic data are available. Although the independent assortment criterion is relatively easy to apply, the results of this study show that it is overly conservative and therefore its use would unfairly restrict the sizes of phylogenomic datasets. It is therefore recommended that researchers only refer to genealogically independent loci when discussing the independent loci assumption in phylogenomics and avoid using terms that may conflate this assumption with independent assortment. Moreover, whenever feasible, researchers should use methods for delimiting putatively independent loci that take into account both homologous recombination and effective population size (i.e., long-term effective recombination).
bioRxiv preprint first posted online Jul. 28, 2016; doi: http://dx.doi.org/10.1101/066332. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

