A new method for handling missing species in diversification analysis applicable to randomly or nonrandomly sampled phylogenies.
A common pattern found in phylogeny-based empirical studies of diversification is a decrease in the rate of lineage accumulation toward the present. This early-burst pattern of cladogenesis is often interpreted as a signal of adaptive radiation or density-dependent processes of diversification. However, incomplete taxonomic sampling is also known to artifactually produce patterns of rapid initial diversification. The Monte Carlo constant rates (MCCR) test, based upon Pybus and Harvey's gamma (γ)-statistic, is commonly used to accommodate incomplete sampling, but this test assumes that missing taxa have been randomly pruned from the phylogeny. Here we use simulations to show that preferentially sampling disparate lineages within a clade can produce severely inflated type-I error rates of the MCCR test, especially when taxon sampling drops below 75%. We first propose two corrections for the standard MCCR test and assess their statistical properties: the proportionally deeper splits MCCR, which assumes that missing taxa are more likely to be recently diverged, and the deepest splits only MCCR, which assumes that all missing taxa are the youngest lineages in the clade. We then extend these two tests into a generalized form that allows the degree of nonrandom sampling (NRS) to be controlled by a scaling parameter, α. This generalized test is then applied to two recent studies. The new test allows systematists to account for nonrandom taxonomic sampling when assessing temporal patterns of lineage diversification in empirical trees. Given the dramatic effect NRS can have on the behavior of the MCCR test, we argue that evaluating the sensitivity of this test to NRS should become the norm when investigating patterns of cladogenesis in incompletely sampled phylogenies.
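
To make the quantities above concrete, the following Python sketch computes the γ-statistic from internode intervals and builds a Monte Carlo null distribution under one possible reading of the deepest splits only assumption, in which removing the youngest lineages is equivalent to keeping only the oldest branching events. It is an illustration under stated assumptions, not the implementation used in the study: the function names, the pure-birth simulator and its stopping rule, and the add-one p-value are all choices made for this sketch. The random-pruning and α-scaled variants need tree topology and are not shown.

```python
import math
import random


def gamma_statistic(intervals):
    """Pybus and Harvey's gamma from internode intervals g_2..g_n.

    intervals[0] is the time spent with 2 lineages, intervals[-1] the time
    spent with n lineages (n = number of tips).
    """
    n = len(intervals) + 1
    if n < 3:
        raise ValueError("gamma needs at least 3 tips")
    weighted = [k * g for k, g in zip(range(2, n + 1), intervals)]
    total = sum(weighted)  # T = sum_{j=2..n} j * g_j
    running = 0.0
    cumulative = 0.0
    for w in weighted[:-1]:  # outer sum runs over i = 2..n-1
        running += w
        cumulative += running
    numerator = cumulative / (n - 2) - total / 2
    return numerator / (total * math.sqrt(1 / (12 * (n - 2))))


def yule_intervals(n_tips, birth_rate, rng):
    """Internode intervals of a pure-birth tree grown to n_tips lineages.

    Assumption of this sketch: the tree stops just before the next split.
    """
    return [rng.expovariate(k * birth_rate) for k in range(2, n_tips + 1)]


def keep_deepest_splits(intervals, n_sampled):
    """Drop the youngest splits so that n_sampled tips remain."""
    n_total = len(intervals) + 1
    if not 3 <= n_sampled <= n_total:
        raise ValueError("need 3 <= n_sampled <= n_total")
    kept = intervals[: n_sampled - 2]             # g_2 .. g_{n-1} unchanged
    kept.append(sum(intervals[n_sampled - 2:]))   # last split to the present
    return kept


def deepest_splits_pvalue(observed_gamma, n_total, n_sampled,
                          n_reps=1000, seed=0):
    """One-tailed Monte Carlo p-value for an observed gamma.

    The null is simulated under constant-rate pure birth with the
    n_total - n_sampled youngest lineages treated as missing.
    """
    rng = random.Random(seed)
    null = [
        gamma_statistic(
            keep_deepest_splits(yule_intervals(n_total, 1.0, rng), n_sampled)
        )
        for _ in range(n_reps)
    ]
    n_as_extreme = sum(g <= observed_gamma for g in null)
    return (n_as_extreme + 1) / (n_reps + 1)  # add-one correction (a choice)
```

Because γ is unchanged when all branch lengths are rescaled, the birth rate is fixed at 1.0 in the simulation. A call such as `deepest_splits_pvalue(-3.2, n_total=100, n_sampled=50)` would compare a hypothetical observed γ of -3.2 from a 50-tip tree against a null in which the 50 missing species are the youngest lineages of a 100-species clade.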