Selection for short introns in highly expressed genes

Cristian I. Castillo-Davis, Sergei L. Mekhedov, Daniel L. Hartl, Eugene V. Koonin, Fyodor A. Kondrashov
Nature Genetics
Transcription is a slow and expensive process: in eukaryotes, approximately 20 nucleotides can be transcribed per second at the expense of at least two ATP molecules per nucleotide. Thus, at least for highly expressed genes, transcription of long introns, which are particularly common in mammals, is costly. Using data on the expression of genes that encode proteins in Caenorhabditis elegans and Homo sapiens, we show that introns in highly expressed genes are substantially shorter than those in… 
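The two figures quoted in the abstract (about 20 nucleotides per second, at least two ATP molecules per nucleotide) allow a rough back-of-the-envelope estimate of what transcribing an intron costs. The sketch below is only illustrative: the function name, the intron lengths, and the transcript count are hypothetical and do not come from the paper, and the ATP figure is a lower bound.

```python
# Figures taken from the abstract above; both are approximate.
ELONGATION_RATE_NT_PER_S = 20   # ~20 nucleotides transcribed per second
ATP_PER_NT = 2                  # at least two ATP molecules per nucleotide (lower bound)


def intron_transcription_cost(intron_length_nt, transcripts=1):
    """Return (seconds per transcript, minimum total ATP) for one intron.

    Hypothetical helper for illustration; not part of the original study.
    """
    seconds = intron_length_nt / ELONGATION_RATE_NT_PER_S
    atp = intron_length_nt * ATP_PER_NT * transcripts
    return seconds, atp


# Assumed intron lengths and transcript count, chosen only for illustration.
for length in (100, 1_000, 10_000):
    seconds, atp = intron_transcription_cost(length, transcripts=1_000)
    print(f"{length:>6} nt intron: {seconds:7.1f} s per transcript, "
          f"at least {atp:,} ATP for 1,000 transcripts")
```

Under these assumptions both the time and the ATP cost scale linearly with intron length, and the ATP cost also scales with the number of transcripts, which is why the argument applies mainly to highly expressed genes.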
Evidence against the energetic cost hypothesis for the short introns in highly expressed genes
Qualitative estimation shows that the deleterious effect of long introns in highly expressed genes is too small to be efficiently selected against in mammals, and that the shortness of these introns should not be attributed to energy constraint.
Higher frequency of intron loss from the promoter proximally paused genes of Drosophila melanogaster
It is suggested that transcription delay is comparable to splicing delay only when the intron is 28.5 kb or larger, which is greater in size than 95% of vertebrate introns, 99.5% of Drosophila introns and all the annotated introns of Saccharomyces cerevisiae and Arabidopsis thaliana.
Gametophytic Selection in Arabidopsis thaliana Supports the Selective Model of Intron Length Reduction
Observations support the view that selection for efficiency contributes to the reduction in intron length and provide the first report of a molecular signature of strong gametophytic selection.
The Small Introns of Antisense Genes Are Better Explained by Selection for Rapid Transcription Than by “Genomic Design”
It is shown that the effects of the economy model (which argues that the time that expression takes is more important than the energetic cost, such that some weakly but rapidly expressed genes might also have small introns) are not specific to noncoding RNAs, and that the predictions of the “genomic design” model are for the most part not upheld.
Intron Length Coevolution across Mammalian Genomes
The results reveal a novel aspect of gene coevolution and provide a means to identify genes, protein complexes and biological processes that may be particularly sensitive to changes in transcriptional dynamics.
Relationship between Gene Compactness and Base Composition in Rice and Human Genome
It is suggested that, in GC-rich rice genes, long introns are under selection for enhancing transcriptional efficiency by modulating pre-mRNA secondary structural stability, and that the evolutionary mechanisms behind genome organization differ between these two genomes.
Analysis of intronless genes involved in oscillation and differentiation
This research set out to elucidate the functions of intronless genes in humans by studying their involvement in the oscillatory gene expression pattern that occurs in the pre-somitic mesoderm of the developing embryo.
Selection for the miniaturization of highly expressed genes.
Adaptation of codon and amino acid use for translational functions in highly expressed cricket genes
A model whereby codon use in highly expressed genes, including optimal, wobble, and non-optimal codons, and their tRNA abundances, as well as amino acid use, have been influenced by adaptation for various functional roles in translation within this cricket is suggested.
Abundance of dinucleotide repeats and gene expression are inversely correlated: a role for gene function in addition to intron length.
It is observed that, even after controlling for the effects of GC content and average intron length, the effect of repeats, albeit somewhat weaker, was persistent and definite; it is suggested that negative selection of (TG/CA)(n≥12) microsatellites in the evolution of highly expressed genes was also controlled by gene function in addition to intron length.


Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis.
L. Duret, D. Mouchiroud. Proceedings of the National Academy of Sciences of the United States of America, 1999.
Surprisingly, there is a strong negative correlation between codon usage and protein length and this puzzling observation raises the question of how translation efficiency affects fitness in multicellular organisms.
Synonymous codon bias is related to gene length in Escherichia coli: selection for translational accuracy?
The level of synonymous codon bias is shown to be positively correlated with gene length in Escherichia coli genes that are thought to be expressed at similar levels; it is argued that the positive correlation could be caused by selection to avoid missense errors during translation.
Intron-exon structures of eukaryotic model organisms.
The variable intron-exon structures of the 10 model organisms reveal two interesting statistical phenomena, which cast light on some previous speculations about genome size and intron size.
High intrinsic rate of DNA loss in Drosophila
Phylogenetic analysis of a non-LTR element, Helena, demonstrates that copies lose DNA at an unusually high rate, suggesting that lack of pseudogenes in Drosophila is the product of rampant deletion of DNA in unconstrained regions, and has important implications for the study of genome evolution in general and the 'C-value paradox' in particular.
Genome size and intron size in Drosophila.
The authors suggested that the paucity of pseudogenes in Drosophila is the product of rampant deletion of DNA in regions not subjected to selective constraints, and extrapolated that different deletion rates may contribute to the divergence in genome size among taxa.
The correlation between intron length and recombination in Drosophila. Dynamic equilibrium between mutational and selective forces.
The study of the proposed dynamic model, taking into account interference among selected sites, might shed light on many aspects of the comparative biology of genome sizes including the C value paradox.
Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae
Examination of data from three independent studies that used oligonucleotide arrays or SAGE to estimate mRNA concentrations for nearly all genes in the genome found strikingly unequal usage of different synonymous codons in five Saccharomyces cerevisiae nuclear genes having high protein levels.
The protein Aly links pre-messenger-RNA splicing to nuclear export in metazoans
It is shown that Aly, the metazoan homologue of the yeast mRNA export factor Yra1p, is recruited to messenger ribonucleoprotein (mRNP) complexes generated by splicing.
Genomic analysis of gene expression in C. elegans.
The results provide an estimate of the number of expressed genes in the nematode, reveal relations between gene function and gene expression that can guide analysis of uncharacterized worm genes, and demonstrate a shift in expression from evolutionarily conserved genes to worm-specific genes over the course of development.