#### Abstract

Likelihood ratio tests (LRTs) for comparing models of sequence evolution have become popular over the last few years (Goldman 1993; Yang, Goldman, and Friday 1994, 1995; Huelsenbeck and Crandall 1997; Huelsenbeck and Rannala 1997). In their simplest form, such tests compare a simpler null hypothesis ($H_0$) with a more complex alternative hypothesis ($H_1$) that is a generalization of $H_0$. $H_0$ can be derived from $H_1$ by fixing one or more of its free parameters at particular values, and the hypotheses are described as nested. Although it is also possible to test non-nested models (Goldman 1993), nested models are often preferred, as statistical tests are simpler to perform and their results can be easier to interpret. The test statistic for an LRT can be written as $2\delta = 2\ln(\hat{L}_{H_1}/\hat{L}_{H_0}) = 2(\ln(\hat{L}_{H_1}) - \ln(\hat{L}_{H_0}))$, where $\hat{L}_{H_0}$ and $\hat{L}_{H_1}$ are the maximum-likelihood (ML) scores under hypotheses $H_0$ and $H_1$, respectively. This statistic measures how much improvement $H_1$ gives over $H_0$, and when the hypotheses are nested, $2\delta$ will always be nonnegative. For these nested hypotheses, and under certain regularity conditions, the asymptotic distribution of $2\delta$ (i.e., for large amounts of data) will be $\chi^2_k$. Here, $k$ is the number of degrees of freedom by which $H_0$ and $H_1$ differ, that is, the number of free parameters of $H_1$ whose values must be fixed to derive $H_0$ (Wald 1949; Silvey 1975; Felsenstein 1981; Goldman 1993; Yang, Goldman, and Friday 1994, 1995). (In effect, each free parameter contributes a $\chi^2_1$ variate to the distribution of $2\delta$, with the sum of $k$ independent $\chi^2_1$ variates being distributed as $\chi^2_k$.) Statistical tests assessed using such $\chi^2$ distributions have now become a widespread and useful tool in phylogenetics (Huelsenbeck and Crandall 1997; Huelsenbeck and Rannala 1997).
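As an illustration of the test described above, the following minimal sketch computes the LRT statistic (twice the difference between the maximized log-likelihoods under the alternative and null hypotheses) and its asymptotic p-value from a chi-square distribution with k degrees of freedom. The function names `lrt_statistic` and `chi2_sf` and the log-likelihood values are hypothetical and chosen only for this example; they are not taken from the studies cited. The tail probability is computed with the standard recurrence for the regularized upper incomplete gamma function, using only the Python standard library.

```python
import math


def lrt_statistic(lnL_h0: float, lnL_h1: float) -> float:
    """Return 2 * (lnL_H1 - lnL_H0), the LRT statistic for nested hypotheses."""
    return 2.0 * (lnL_h1 - lnL_h0)


def chi2_sf(x: float, k: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variate with k d.f.

    Uses Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1) with h = x / 2,
    starting from Q(1/2, h) = erfc(sqrt(h)) for odd k or Q(1, h) = exp(-h)
    for even k, and stepping a up to k / 2.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if x <= 0.0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < k / 2.0:
        # Add h**a * exp(-h) / Gamma(a + 1), evaluated in log space.
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods; H1 has one extra free parameter.
    lnL_h0, lnL_h1 = -1000.0, -995.0
    stat = lrt_statistic(lnL_h0, lnL_h1)
    p_value = chi2_sf(stat, k=1)
    print(f"LRT statistic = {stat:.3f}, asymptotic p-value = {p_value:.5f}")
```

The p-value obtained this way relies on the asymptotic chi-square approximation, so it is only as reliable as that approximation is for the data at hand.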
Recently, there has been renewed interest in testing whether the predicted $\chi^2$ distribution gives a reliable estimate of the true distribution of $2\delta$ under realistic conditions (e.g., with finite sequence lengths). Whelan and Goldman (1999) investigated cases in which the competing hypotheses were different models of nucleotide substitution. Under three specimen experimental designs (representing realistic phylogenies and nucleotide substitution processes), we found that the $\chi^2$ distribution was acceptable for performing tests of the significance of parameters describing the relative rate of transition