Full-length transcriptome assembly from RNA-Seq data without a reference genome.

@article{Grabherr2011FulllengthTA,
  title={Full-length transcriptome assembly from RNA-Seq data without a reference genome.},
  author={Manfred G. Grabherr and Brian J. Haas and Moran Yassour and Joshua Z. Levin and Dawn A Thompson and Ido Amit and Xian Adiconis and Lin Fan and Raktima Raychowdhury and Qiandong Zeng and Zehua Chen and Evan Mauceli and Nir Hacohen and Andreas Gnirke and Nicholas Rhind and Federica Di Palma and Bruce W. Birren and Chad Nusbaum and Kerstin Lindblad-Toh and Nir Friedman and Aviv Regev},
  journal={Nature biotechnology},
  year={2011},
  volume={29},
  number={7},
  pages={644--652}
}
Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. [...] Key Result: By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes.

Extension of Partial Gene Transcripts by Iterative Mapping of RNA-Seq Raw Reads
TLDR
An effective in silico method is presented for improving the contiguity of partial transcripts; in the absence of a reference genome, it may be a quick and cost-effective alternative to extending them by laboratory experimentation.
Transcriptome Analysis for Non-Model Organism: Current Status and Best-Practices
TLDR
An overview of state-of-the-art methods is provided, covering quality checking and pre-processing of raw reads, the pros and cons of de novo transcriptome assemblers, generation of non-redundant transcript data, and further mining of transcriptomic data for particular biological questions.
Semantic Assembly and Annotation of Draft RNAseq Transcripts without a Reference Genome
TLDR
A computational workflow for the reconstruction and functional annotation of expressed gene transcripts that does not require a reference genome sequence and can be tolerant to low coverage, high error rates and other issues that often lead to poor results of de novo assembly in studies of non-model organisms is proposed.
De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity
TLDR
This protocol describes the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms and presents Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation and R/Bioconductor packages for identifying differentially expressed transcripts across samples.
SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads
TLDR
The conclusion is that SOAPdenovo-Trans provides higher contiguity, lower redundancy and faster execution, compared with two other popular transcriptome assemblers.
De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers
TLDR
A large-scale comparative study in which 10 de novo assembly tools are applied to 9 RNA-Seq data sets spanning different kingdoms of life, finding that Trinity, SPAdes, and Trans-ABySS, followed by Bridger and SOAPdenovo-Trans, generally outperformed the other tools compared.
A Comparison of Next Generation Sequencing Technologies for Transcriptome Assembly and Utility for RNA-Seq in a Non-Model Bird
TLDR
In the absence of a reference genome, it is found that Illumina reads alone produced a high quality transcriptome appropriate for RNA-Seq gene expression analyses.
rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data
TLDR
The novel transcriptome assembler rnaSPAdes, which has been developed on top of the SPAdes genome assembler, typically outperforms other assemblers on important properties such as the number of assembled genes and isoforms, while also having higher accuracy statistics on average compared with its closest competitors.
Comparative performance of transcriptome assembly methods for non-model organisms
TLDR
This study provides general guidance for transcriptome assembly of RNA-Seq data from organisms with or without a sequenced genome, and emphasizes the efficacy of de novo assembly, which can be as effective as genome-guided assembly when the reference genome assembly is fragmented.
FRAMA: from RNA-seq data to annotated mRNA assemblies
TLDR
A comparison with three different sources of naked mole-rat transcripts reveals that FRAMA’s gene models are better supported by RNA-seq data than any other transcript set, demonstrating the competitiveness of FRAMA with state-of-the-art genome-based transcript reconstruction approaches.

References

SHOWING 1-10 OF 37 REFERENCES
Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution
TLDR
High-throughput sequencing of complementary DNAs (RNA-Seq) and strand-specific array data provide rich condition-specific information on novel, mostly non-coding transcripts, untranslated regions and gene structures, thus improving the existing genome annotation.
Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing
TLDR
This work presents a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run.
De novo transcriptome assembly with ABySS
TLDR
This work assembled approximately 194 million reads using ABySS into 66 921 contigs 100 bp or longer, representing over 30 million base pairs of unique transcriptome sequence, or roughly 1% of the genome.
Advancing RNA-Seq analysis
TLDR
New approaches for RNA-Seq analysis that capture genome-wide transcription and splicing in unprecedented detail are introduced, and a de novo assembly approach implemented in the ABySS software reduces the annotation problem to that of aligning full-length cDNAs, which is well handled by several algorithms.
Comprehensive comparative analysis of strand-specific RNA sequencing methods
TLDR
A comprehensive computational pipeline is developed to compare library quality metrics from any RNA-seq method, and the dUTP second-strand marking and the Illumina RNA ligation methods are identified as the leading protocols, with the former benefitting from the current availability of paired-end sequencing.
Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
TLDR
The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
Ab initio reconstruction of cell type–specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs
TLDR
Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence, is presented and the power of ab initio reconstruction is demonstrated to render a comprehensive picture of mammalian transcriptomes.
Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs
TLDR
Substantial variation in protein-coding genes is identified, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons, and the gene structures of over a thousand lincRNA and antisense loci are determined.
TopHat: discovering splice junctions with RNA-Seq
TLDR
The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer.
ALLPATHS: de novo assembly of whole-genome shotgun microreads.
TLDR
A general method for genome assembly is described that can be applied to all types of DNA sequence data: not only short-read data but also conventional sequence reads.