• Publications
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
TLDR
It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.
De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis
TLDR
This protocol provides a workflow for genome-independent transcriptome analysis leveraging the Trinity platform and presents Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project
TLDR
Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
Initial sequencing and comparative analysis of the mouse genome
TLDR
The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
BUCKy: Gene tree/species tree reconciliation with Bayesian concordance analysis
TLDR
BUCKy is a C++ program that implements Bayesian concordance analysis, using a non-parametric clustering of genes with compatible trees and reconstructing the primary concordance tree from clades supported by the largest proportions of genes.
Population Genomics: Whole-Genome Analysis of Polymorphism and Divergence in Drosophila simulans
TLDR
A population genetic analysis of Drosophila simulans is presented based on whole-genome shotgun sequencing of multiple inbred lines and comparison of the resulting data to genome assemblies of the closely related species, D. melanogaster and D. yakuba, to suggest several new hypotheses regarding the genetic and biological mechanisms controlling polymorphism and divergence across the Drosophila genome.
The ENCODE (ENCyclopedia Of DNA Elements) Project
TLDR
The ENCyclopedia Of DNA Elements (ENCODE) Project is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function.
RNA-Seq gene expression estimation with read mapping uncertainty
TLDR
Simulations with the method indicate that a read length of 20–25 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed, and the method is capable of modeling non-uniform read distributions.
De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity
TLDR
This protocol describes the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms and presents Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation and R/Bioconductor packages for identifying differentially expressed transcripts across samples.
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution
TLDR
A draft genome sequence of the red jungle fowl, Gallus gallus, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes.