FUBAR: a fast, unconstrained bayesian approximation for inferring selection.

Authors: B. Murrell, Sasha Moola, Amandla Mabona, Thomas Weighill, Daniel J. Sheward, Sergei L. Kosakovsky Pond, Konrad Scheffler
Journal: Molecular Biology and Evolution, volume 30, issue 5

Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large… 

Contrast-FEL—A Test for Differences in Selective Pressures at Individual Sites among Clades and Sets of Branches

Contrast-FEL is a simple extension of a popular fixed effects likelihood method for codon-based phylogenetic maximum likelihood testing. It identifies individual alignment sites where any of the K ≥ 2 sets of branches in a phylogenetic tree have detectably different dN/dS ratios, indicative of different selective regimes.

Gene-wide identification of episodic selection.

A new approach to identifying gene-wide evidence of episodic positive selection, where the non-synonymous substitution rate is transiently greater than the synonymous rate, and a computationally inexpensive evidence metric for identifying sites subject to episodic positive selection on any foreground branches.

Synonymous Site-to-Site Substitution Rate Variation Dramatically Inflates False Positive Rates of Selection Analyses: Ignore at Your Own Peril

It is found that failing to account for even moderate levels of synonymous site-to-site rate variation (SRV) in selection testing is likely to produce intolerably high false positive rates. The results add to a growing literature establishing that tests of selection are much more sensitive to certain model assumptions than previously believed.

On the Validity of Evolutionary Models with Site-Specific Parameters

A simulation study is presented providing empirical evidence that a simple version of the models in question does exhibit sensible convergence behavior and that additional taxa, despite not being independent of each other, lead to improved parameter estimates.

A Bayesian Mutation–Selection Framework for Detecting Site-Specific Adaptive Evolution in Protein-Coding Genes

A Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment is presented and it is suggested that the new approach shows greater sensitivity than traditional methods.

One-rate models outperform two-rate models in site-specific dN/dS estimation

It is found that one-rate inference models universally outperform two-rate models for estimating reliable site-specific dN/dS ratios and high levels of divergence among sequences are more critical for obtaining precise point estimates than the number of sequences in the alignment.

Identification of positive selection in genes is greatly improved by using experimentally informed site-specific models

A new approach is described that uses a null model based on experimental measurements of a gene’s site-specific amino-acid preferences, generated by deep mutational scanning in the lab. In four genes, it identifies sites of adaptive substitutions far better than a comparable method that simply compares the rates of nonsynonymous and synonymous substitutions.

Limited utility of residue masking for positive-selection inference.

It is found that no filter, including original Guidance, consistently benefitted positive-selection inferences, and all improvements detected were exceedingly minimal, and in certain circumstances, Guidance-based filters worsened inferences.

The relationship between dN/dS and scaled selection coefficients.

Establishing mathematical links among modeling frameworks represents a novel, powerful strategy to pinpoint previously unrecognized model limitations and strengths.

“Balancing” balancing selection? Assortative mating at the major histocompatibility complex despite molecular signatures of balancing selection

It is suggested that in systems where individual fitness does not increase monotonically with MHC diversity, assortative mating may help to avoid excessive offspring heterozygosity that could otherwise arise from long‐standing balancing selection.



A random effects branch-site model for detecting episodic diversifying selection.

Felsenstein's pruning algorithm is extended to allow efficient likelihood computations for models in which variation over branches (and not just sites) is described in the random effects likelihood framework, and this model treats the selective class of every branch at a particular site as an unobserved state that is chosen independently of that at any other branch.
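
The idea summarized above can be illustrated with a toy sketch. It is not the paper's implementation: it assumes a symmetric two-state character instead of a codon model, and the tree, RATES, and WEIGHTS values are hypothetical stand-ins for selective classes. Because each branch picks its class independently, the class can be summed out branch by branch by mixing the per-class transition matrices inside the usual pruning recursion.

```python
import math
from itertools import product

def two_state_matrix(rate, t):
    """Transition matrix of a symmetric two-state Markov chain after time t."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]

def mixture_matrix(class_rates, class_weights, t):
    """Weighted average of per-class transition matrices for one branch.

    The branch's class is an unobserved state drawn independently of all
    other branches, so it can be marginalized locally.
    """
    mixed = [[0.0, 0.0], [0.0, 0.0]]
    for rate, weight in zip(class_rates, class_weights):
        p = two_state_matrix(rate, t)
        for i in range(2):
            for j in range(2):
                mixed[i][j] += weight * p[i][j]
    return mixed

def partial_likelihood(node, leaf_states, class_rates, class_weights):
    """Pruning recursion: likelihood of the data below node, per node state."""
    if isinstance(node, str):  # a leaf is represented by its name
        observed = leaf_states[node]
        return [1.0 if s == observed else 0.0 for s in range(2)]
    result = [1.0, 1.0]
    for child, branch_length in node:  # internal node: list of (child, length)
        below = partial_likelihood(child, leaf_states, class_rates, class_weights)
        p = mixture_matrix(class_rates, class_weights, branch_length)
        for parent_state in range(2):
            result[parent_state] *= sum(
                p[parent_state][s] * below[s] for s in range(2)
            )
    return result

def site_likelihood(tree, leaf_states, class_rates, class_weights):
    """Site likelihood with a uniform distribution on the root state."""
    root = partial_likelihood(tree, leaf_states, class_rates, class_weights)
    return 0.5 * root[0] + 0.5 * root[1]

# Hypothetical three-taxon tree ((A:0.1, B:0.2):0.05, C:0.3) and two classes.
TREE = [([("A", 0.1), ("B", 0.2)], 0.05), ("C", 0.3)]
RATES = [0.5, 3.0]
WEIGHTS = [0.7, 0.3]

# Sanity check: summed over all leaf patterns, the likelihoods should
# total 1 up to floating-point rounding.
total = sum(
    site_likelihood(TREE, dict(zip("ABC", pattern)), RATES, WEIGHTS)
    for pattern in product(range(2), repeat=3)
)
```

The cost stays linear in the number of branches, whereas enumerating every assignment of classes to branches would grow exponentially.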

A Bayesian model comparison approach to inferring positive selection.

The Bayesian approach outperforms the empirical Bayes method when the amount of sequence divergence is small and is less prone to false-positive inference when the sequences are saturated, while the results are indistinguishable for intermediate levels of sequence divergence.

Conjugate Gibbs Sampling for Bayesian Phylogenetic Models

The conjugate Gibbs formalism allows one to propose efficient implementations of complex models, for instance assuming site-specific substitution processes, that would not be accessible to standard MCMC methods.

Uniformization for sampling realizations of Markov processes: applications to Bayesian implementations of codon substitution models

A general method, based on a uniformization technique, which can be utilized to generate realizations of a Markovian substitution process conditional on an alignment of character states and a given tree topology is described.
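
As a rough illustration of the uniformization idea only, the sketch below simulates an unconditioned path of a small continuous-time Markov chain. It does not condition on observed end states, an alignment, or a tree as the method summarized above does, and the rate matrix Q and function names are hypothetical. Candidate events arrive as a Poisson process with rate mu, the largest exit rate, and each event moves the chain according to R = I + Q/mu, so some events are "virtual" self-transitions that leave the state unchanged.

```python
import random

def uniformized_jump_matrix(Q):
    """Return (mu, R) where mu is the largest exit rate and R = I + Q / mu."""
    n = len(Q)
    mu = max(-Q[i][i] for i in range(n))
    R = [
        [(1.0 if i == j else 0.0) + Q[i][j] / mu for j in range(n)]
        for i in range(n)
    ]
    return mu, R

def sample_path(Q, start, duration, rng):
    """Sample a path on [0, duration) as a list of (time, state) changes."""
    mu, R = uniformized_jump_matrix(Q)
    states = range(len(Q))
    path = [(0.0, start)]
    state = start
    t = rng.expovariate(mu)  # time of the first candidate event
    while t < duration:
        new_state = rng.choices(states, weights=R[state])[0]
        if new_state != state:  # virtual jumps are not recorded
            path.append((t, new_state))
            state = new_state
        t += rng.expovariate(mu)
    return path

# Hypothetical two-state rate matrix (each row sums to zero).
Q = [[-1.0, 1.0], [2.0, -2.0]]
path = sample_path(Q, start=0, duration=5.0, rng=random.Random(1))
```

Using one dominating rate for all states is what makes the event times independent of the current state, which is the property that conditional samplers build on.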

Taking Variation of Evolutionary Rates Between Sites into Account in Inferring Phylogenies

A model based on population genetics is presented predicting how the rates of evolution might vary from locus to locus, and Markov chain Monte Carlo likelihood methods may be the only practical way to carry out computations for these models.

Not so different after all: a comparison of methods for detecting amino acid sites under selection.

Three approaches for estimating the rates of nonsynonymous and synonymous changes at each site in a sequence alignment in order to identify sites under positive or negative selection are considered, suggesting that previously reported differences between results obtained by counting methods and random effects models arise due to a combination of the conservative nature of counting-based methods, the failure of current random effects models to allow for variation in synonymous substitution rates, and the naive application of random effects models to extremely sparse data sets.

Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A

Methods for obtaining approximate estimates of branch lengths for codon models are explored, and the estimates are used to test for positive selection and to identify sites under diversifying Darwinian selection in the viral gene.

A Dirichlet process model for detecting positive selection in protein-coding DNA sequences.

This work describes an approach to modeling variation in the nonsynonymous rate of substitution by using a Dirichlet process mixture model, which allows there to be a countably infinite number of nonsynonymous rate classes and is very flexible in accommodating different potential distributions.
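
To make the "countably infinite number of classes" concrete, here is a minimal sketch of the Chinese restaurant process, a standard way to draw class assignments from a Dirichlet process prior. It shows the prior only, not the inference procedure of the work summarized above; the concentration parameter alpha, the number of sites, and the function name are assumptions chosen for illustration.

```python
import random

def chinese_restaurant_assignments(n_sites, alpha, rng):
    """Assign sites to rate classes under a Chinese restaurant process.

    Each site joins an existing class with probability proportional to the
    class size, or opens a new class with probability proportional to alpha.
    """
    assignments = []
    counts = []  # counts[k] is the number of sites currently in class k
    for _ in range(n_sites):
        weights = counts + [alpha]  # last entry stands for a new class
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts

assignments, counts = chinese_restaurant_assignments(
    n_sites=200, alpha=1.5, rng=random.Random(7)
)
```

The number of occupied classes is not fixed in advance: it grows with the number of sites, and larger values of alpha tend to produce more classes.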

Detecting Amino Acid Sites Under Positive Selection and Purifying Selection

It is shown that the SLR method can be more powerful than currently published methods for detecting the location of positive selection, especially in difficult cases where the strength of selection is low.

Detecting Individual Sites Subject to Episodic Diversifying Selection

It is found that episodic selection is widespread and it is concluded that the number of sites experiencing positive selection may have been vastly underestimated.