Robust Identification of Noncoding RNA from Transcriptomes Requires Phylogenetically-Informed Sampling

Authors: Stinus Lindgreen, Sinan Uğur Umu, Alicia Sook-Wei Lai, Hisham Eldai, Wenting Liu, Stephanie McGimpsey, Nicole E. Wheeler, Patrick J. Biggs, Nick R. Thomson, Lars Barquist, Anthony M. Poole and Paul P. Gardner
Journal: PLoS Computational Biology

Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, is improving noncoding RNA identification and expression…

Comparative genomics provides structural and functional insights into Bacteroides RNA biology

This work investigates putative RNA‐binding proteins and predicts that a Bacteroides cold‐shock protein homolog has an RNA‐related function. It also applies an in silico protocol incorporating both sequence and structural analysis to determine the consensus structures and conservation of nine Bacteroides noncoding RNA families.

A high-resolution transcriptome map identifies small RNA regulation of metabolism in the gut microbe Bacteroides thetaiotaomicron

Bacteria of the genus Bacteroides are common members of the human intestinal microbiota and important degraders of polysaccharides in the gut. Among them, the species Bacteroides thetaiotaomicron has

Loss of Conserved Noncoding RNAs in Genomes of Bacterial Endosymbionts

It is found that the loss of cis-regulatory ncRNA sequences, which regulate the expression of cognate protein-coding genes, is characterized by the reduction of secondary structure formation propensity, GC content, and length of the corresponding genomic regions.

Origin, Evolution, and Loss of Bacterial Small RNAs.

The need for more-comprehensive analyses of sRNA evolutionary patterns is highlighted as a means to improve novel sRNA detection, enhance genome annotation, and deepen the understanding of regulatory networks in bacteria.

Accelerating Discovery and Functional Analysis of Small RNAs with New Technologies.

Recent developments in transcriptomics (RNA-seq) and functional genomics are described that are expected to help develop an integrated, systems-level view of sRNA biology in bacteria.

Annotating RNA motifs in sequences and alignments

This work presents a bioinformatic approach to characterise RNA motifs, which are the central building blocks of RNA structure, introduces a new profile-based database of RNA motifs (RMfam), and illustrates its application for investigating the evolution and functional characterisation of RNA.

Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families

This unit introduces methods developed by the Rfam database for identifying “families” of homologous ncRNAs starting from single “seed” sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package.

Molecular phenotyping of infection-associated small non-coding RNAs

It is argued that new sequencing-based technologies can work around the problem of many known virulence factors failing to produce robust phenotypes by providing a ‘molecular phenotype’, defined in terms of the specific transcriptional dysregulation in the infection system induced by gene deletion.

RNA regulators responding to ribosomal protein S15 are frequent in sequence space

Of the six sequences the authors characterize, four show regulatory activity in an Escherichia coli reporter assay, suggesting that regulation in response to S15 is relatively easily acquired, and footprinting and mutagenesis analysis indicates that protein binding proximal to regulatory features is sufficient to enable regulation.

A Peroxide-Responding sRNA Evolved from a Peroxidase mRNA

The source from which OxyS, one of the most well-studied sRNAs, arose is described, protein-coding genes are identified as a potential raw material from which new sRNAs could emerge, and a novel evolutionary path is suggested through which new sRNAs could get incorporated into pre-existing regulatory networks.



RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily

Studying bacterial transcriptomes using RNA-seq

Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each Domain of Life

It is reported that 99% of known RNA families are restricted to a single domain of life, revealing discrete repertoires for each domain, and the majority of modern cellular RNA repertoires have primarily evolved in a domain-specific manner.

Stem cell transcriptome profiling via massive-scale mRNA sequencing

A massive-scale RNA sequencing protocol, short quantitative random RNA libraries or SQRL, is developed, highlighting how SQRL can be used to characterize transcriptome content and dynamics in a quantitative and reproducible manner, and suggesting that the understanding of transcriptional complexity is far from complete.

Approaches to querying bacterial genomes with transposon-insertion sequencing

An overview is provided of studies that have examined the reproducibility and accuracy of these methods, as well as studies showing the advantages offered by the high resolution and dynamic range of high-throughput sequencing over previous methods.

RNAz 2.0: Improved Noncoding RNA Detection

RNAz 2.0 provides significant improvements in two respects: (1) the accuracy is increased by the systematic use of dinucleotide models, and (2) technical limitations of the previous version are overcome by increased training data and the usage of an entropy measure to represent sequence similarities.

Insights into the phylogeny and coding potential of microbial dark matter

This study applies single-cell genomics to target and sequence 201 archaeal and bacterial cells from nine diverse habitats belonging to 29 major, mostly uncharted branches of the tree of life, and provides a systematic step towards a better understanding of biological evolution on Earth.

A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

The results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.

Regulation by small RNAs in bacteria: expanding frontiers.