Sequence divergence, functional constraint, and selection in protein evolution.

  • Justin C. Fay, Chung-I Wu
  • Published 28 November 2003
  • Biology
  • Annual Review of Genomics and Human Genetics
The genome sequences of multiple species have enabled functional inferences from comparative genomics. A primary objective is to infer biological functions from the conservation of homologous DNA sequences between species. A second, more difficult, objective is to understand which functional DNA sequences have changed over time and are responsible for species' phenotypic differences. The neutral theory of molecular evolution provides a theoretical framework in which both objectives can be…


Parallel Patterns of Evolution in the Genomes and Transcriptomes of Humans and Chimpanzees

It is found that genes active in brain have accumulated more changes on the human than on the chimpanzee lineage, and patterns suggestive of positive selection on sequence changes as well as expression changes are seen.

Functional Evolution of Proteins

The first functional clustering and evolutionary analysis of the RCSB Protein Data Bank (RCSB PDB) based on similarities between active‐site structures identified a sequential, step‐wise evolution of protein active‐sites and provides novel insights into the emergence of protein function or changes in substrate specificity based on subtle changes in geometry and amino acid composition.

Is gene duplication a viable explanation for the origination of biological information and complexity?

Although the process of gene duplication and subsequent random mutation has certainly contributed to the size and diversity of the genome, it alone is insufficient to explain the origination of the highly complex information pertinent to the essential functioning of living organisms.

Evaluating the role of natural selection in the evolution of gene regulation

The various methods that have been used to test for signs of selection in genomic expression data are reviewed and properties of regulatory systems relevant to neutral models of gene expression are discussed.

Evolution of primate gene expression

A neutral model where negative selection and divergence time are the major factors is a useful null hypothesis for both transcriptome and genome evolution.

Population genomics of domestic and wild yeasts

Rather than one or two domestication events leading to the extant baker’s yeasts, the population structure of S. cerevisiae consists of a few well-defined, geographically isolated lineages and many different mosaics of these lineages, supporting the idea that human influence provided the opportunity for cross-breeding and production of new combinations of pre-existing variations.

Two decades of suspect evidence for adaptive DNA-sequence evolution – Less negative selection misconstrued as positive selection

The two approaches suggest that the variation in the strength of negative selection may be responsible for the bulk of the reported adaptive genome evolution in the last two decades.

Functional Bias and Demographic History Obscure Patterns of Selection among Single-Copy Genes in a Fungal Species Complex

It is found that two species have strongly negatively skewed Tajima’s D, while three others have a positive skew, corresponding well with patterns of demographic expansion and contraction, and an attempt is made to mitigate Gene Ontology term overrepresentation.



Inferring functional constraints and divergence in protein families using 3D mapping of phylogenetic information.

A knowledge-based framework in which the maximum likelihood rate of evolution is used to quantify the level of constraint on the identity of a site is proposed and it is shown that functionally divergent sites occur in a cluster of sites interacting with the catalytic residues.

Testing the neutral theory of molecular evolution with genomic data from Drosophila

The difference between polymorphism and divergence is limited to only a fraction of the genes, which are also evolving more rapidly, and this implies that positive selection is responsible, which suggests a rate of adaptive evolution that is far higher than permitted by the neutral theory of molecular evolution.

Adaptive protein evolution at the Adh locus in Drosophila

A simple statistical test of the neutral protein evolution hypothesis is proposed based on a comparison of the number of amino-acid replacement substitutions to synonymous substitutions in the coding region of a locus, finding that there are more fixed replacement differences between species than expected.

Rate variation of DNA sequence evolution in the Drosophila lineages.

Higher codon bias in Drosophila yakuba than in D. melanogaster and D. simulans was observed in the four AS-C genes, which suggests a change in the action of natural selection on codon usage in these genes.

Adaptive protein evolution in Drosophila

It is estimated that 45% of all amino-acid substitutions have been fixed by natural selection, and that on average one adaptive substitution occurs every 45 years in these species.

Positive Darwinian selection after gene duplication in primate ribonuclease genes.

It was found that the number of arginine residues increased substantially in a short period of evolutionary time after gene duplication, and these amino acid changes probably produced the novel anti-pathogen function of ECP.

Evolution of transcription factor binding sites in Mammalian gene regulatory regions: conservation and turnover.

An analysis of the evolutionary dynamics of transcription factor binding sites whose function had been experimentally verified in promoters of 51 human genes, comparing their sequences to homologous sequences in other primate species and rodents, shows extensive divergence.

Episodic adaptive evolution of primate lysozymes

This approach can detect adaptive and purifying episodes and localize them to specific lineages during protein evolution, and it detects a previously unsuspected adaptive episode on the lineage leading to the common ancestor of the modern hominoid lysozymes.

Constant relative rate of protein evolution and detection of functional diversification among bacterial, archaeal and eukaryotic proteins

Relative rates of protein evolution are remarkably constant for the three species groups analyzed here, and deviations from this rate constancy are probably due to changes in selective constraints associated with diversification between orthologs.

Microevolutionary genomics of bacteria.

Different functional categories of genes were shown to evolve at significantly different rates, emphasizing the role of category-specific functional constraints in determining evolutionary rates and suggesting the possibility that nonessential genes are responsible for driving the evolutionary diversification between strains.
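
The Adh entry above describes a test that compares amino-acid replacement substitutions with synonymous substitutions. A minimal sketch of one common way to set up such a comparison follows, assuming a 2x2 table of fixed versus polymorphic counts evaluated with Fisher's exact test. The function names and the counts are hypothetical and chosen only for illustration; they are not data from any paper listed here, and the original study may have used a different statistic. The sketch also computes a neutrality index and the derived proportion of adaptive substitutions (1 minus the neutrality index), which is one way estimates like the 45% figure quoted above are usually framed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more probable than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def mk_test(dn, ds, pn, ps):
    """Compare fixed (dn, ds) and polymorphic (pn, ps) replacement/synonymous counts.

    Returns (p_value, neutrality_index, alpha). Assumes dn, ds and ps are nonzero.
    """
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1 - neutrality_index  # estimated fraction of adaptive replacements
    return p_value, neutrality_index, alpha


# Hypothetical counts, for illustration only.
p_value, ni, alpha = mk_test(dn=10, ds=20, pn=3, ps=40)
print(f"p = {p_value:.4f}, NI = {ni:.3f}, alpha = {alpha:.3f}")
```

A neutrality index below 1 (alpha above 0) indicates an excess of fixed replacement differences relative to polymorphism, the pattern the Adh entry reports. As the entry on suspect evidence for adaptive evolution notes, variation in the strength of negative selection can produce similar signals, so the result needs cautious interpretation.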