Mapping and quantifying mammalian transcriptomes by RNA-Seq

Ali Mortazavi, Brian A. Williams, Kenneth McCue, Lorian Schaeffer, Barbara J. Wold
Nature Methods
We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript… 

Spliced synthetic genes as internal controls in RNA sequencing experiments

A set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), is developed; they represent full-length spliced mRNA isoforms and provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.

Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms

The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.

The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Transcriptomics: Digging deep with RNA-Seq

Two new studies have surveyed the transcriptomes of model yeast species; they identify large numbers of previously unknown transcripts, provide new information about the positions of promoters, exons and 3′ ends, and highlight the enormous level of transcript diversity that can be generated by alternative splicing in mammals.

Advancing RNA-Seq analysis

New approaches for RNA-Seq analysis that capture genome-wide transcription and splicing in unprecedented detail are introduced, and a de novo assembly approach implemented in the ABySS software reduces the annotation problem to that of aligning full-length cDNAs, which is well handled by several algorithms.

Accurate quantification of transcriptome from RNA-Seq data by effective length normalization

It is proposed that NEUMA could become a standard method for quantifying gene transcript levels from RNA-Seq data; it also offers a measure of consistency (‘consistency coefficient’) for each gene between an independently measured gene-wise level and the sum of the isoform levels.

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.

A single-molecule long-read survey of the human transcriptome

The results show the feasibility of deep sequencing full-length RNA from complex eukaryotic transcriptomes at the single-molecule level, and the high-confidence mappings are consistent with GENCODE annotations.

Isoform Expression Analysis Based on RNA-seq Data

This work focuses on methods for simultaneous transcript discovery and quantification in RNA-seq, and adds some recent developments in dealing with non-uniform read distribution within a transcript.



Tag-based approaches for transcriptome research and genome annotation

The 5′ end–specific tags, with their ability to identify transcripts along with their transcriptional start sites, will be of particular interest for gene network studies and may become one of the most important approaches in systems biology.

Shotgun sequencing of the human transcriptome with ORF expressed sequence tags.

Theoretical considerations predict that amplification of expressed gene transcripts by reverse transcription-PCR using arbitrarily chosen primers will result in the preferential amplification of the central portion of the transcript, and this approach should make a significant contribution to the early identification of important human genes.

Genome-Wide Survey of Human Alternative Pre-mRNA Splicing with Exon Junction Microarrays

These genome-wide data provide experimental evidence and tissue distributions for thousands of known and novel alternative splicing events and indicate that at least 74% of human multi-exon genes are alternatively spliced.

Global Identification of Human Transcribed Sequences with Genome Tiling Arrays

This work constructed a series of high-density oligonucleotide tiling arrays representing sense and antisense strands of the entire nonrepetitive sequence of the human genome and found 10,595 transcribed sequences not detected by other methods.

RNA Maps Reveal New RNA Classes and a Possible Function for Pervasive Transcription

Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes and support a highly interleaved organization of the human transcriptome.

The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).

Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors.

Transcriptional Maps of 10 Human Chromosomes at 5-Nucleotide Resolution

The transcribed portions of the human genome are predominantly composed of interlaced networks of both poly A+ and poly A– annotated transcripts and unannotated transcripts of unknown function, which has important implications for interpreting genotype-phenotype associations, regulation of gene expression, and the definition of a gene.

Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays

A novel sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 μm diameter microbeads provides an unprecedented depth of analysis permitting application of powerful statistical techniques for discovery of functional relationships among genes.

Significance of rare mRNA sequences in liver