A novel min-cost flow method for estimating transcript expression with RNA-Seq

@article{Tomescu2013ANM,
  title={A novel min-cost flow method for estimating transcript expression with RNA-Seq},
  author={Alexandru I. Tomescu and A. Kuosmanen and R. Rizzi and V. M{\"a}kinen},
  journal={BMC Bioinformatics},
  year={2013},
  volume={14},
  pages={S15}
}
Background: Through transcription and alternative splicing, a gene can be transcribed into different RNA sequences (isoforms), depending on the individual, on the tissue the cell is in, or in response to some stimuli. Recent RNA-Seq technology allows for new high-throughput ways for isoform identification and quantification based on short reads, and various methods have been put forward for this non-trivial problem. Results: In this paper we propose a novel, radically different method based on…
A Novel Combinatorial Method for Estimating Transcript Expression with RNA-Seq: Bounding the Number of Paths
TLDR
This paper implements three optimizations and heuristics, which achieve better performance on real data, and similar or better performance on simulated data, than the state-of-the-art tools Cufflinks, IsoLasso and SLIDE.
RNA Transcript Assembly Using Inexact Flows
TLDR
The proposed method is the first approach to this problem that explicitly controls the error allowed on each edge in these graphs in order to achieve a flow, and preliminary results on simulated biological data sets show that in many cases the ground truth paths can be recovered at approximately correct abundances, even with noisy input data.
Efficient RNA isoform identification and quantification from RNA-Seq data with network flows
TLDR
This work introduces a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization, leading to better isoform identification while keeping a low computational cost.
On using Longer RNA-seq Reads to Improve Transcript Prediction Accuracy
TLDR
It is shown that, under hypothetical conditions of perfect sequencing, the solution to the Minimum Path Cover with Subpath Constraints problem is able to use long reads more effectively than two state-of-the-art tools, StringTie and FlipFlop.
Ryūtō: network-flow based transcriptome reconstruction
TLDR
An extension of the common splice graph framework is proposed that combines aspects of overlap and bin graphs and makes it possible to efficiently use both multi-splice and paired-end information to the fullest extent; it compares favorably with state-of-the-art methods on both simulated and real-life datasets.
Explaining a Weighted DAG with Few Paths for Solving Genome-Guided Multi-Assembly
TLDR
The approximability of this problem is studied, and a fully polynomial-time approximation scheme (FPTAS) is given for the case when the fitting function penalizes the maximum ratio between the weights of the arcs and their predicted coverage.
A convex formulation for joint RNA isoform detection and quantification from multiple RNA-seq samples
TLDR
A new method for solving the isoform deconvolution problem jointly across several samples is proposed, formulated as a convex optimization problem that allows information to be shared between samples and that can be solved efficiently.
ASGAL: aligning RNA-Seq data to a splicing graph to detect novel alternative splicing events
TLDR
To the best of the authors' knowledge, ASGAL is the first tool that detects novel alternative splicing events by directly aligning reads to a splicing graph.
ASGAL: Aligning RNA-Seq Data to a Splicing Graph to Detect Novel Alternative Splicing Events
TLDR
ASGAL (Alternative Splicing Graph ALigner), a tool for mapping RNA-Seq data to the splicing graph, is presented; it is the first tool that detects novel alternative splicing events by directly aligning reads to a splicing graph.
IsoTree: De Novo Transcriptome Assembly from RNA-Seq Reads - (Extended Abstract)
TLDR
IsoTree is presented, a novel framework for transcript reconstruction in the absence of reference genomes that constructs the splicing graph by connecting reads directly. It performs better in recall on both paired-end and single-end reads, and in precision on paired-end reads, compared to other leading transcript assembly programs including Cufflinks, StringTie and BinPacker.

References

SHOWING 1-10 OF 34 REFERENCES
Inference of Isoforms from Short Sequence Reads
TLDR
A method is proposed to calculate the expression levels of isoforms and to infer isoforms from short RNA-Seq reads using exon-intron boundary, transcription start site (TSS) and poly-A site (PAS) information, together with an efficient algorithm (called IsoInfer) to search for isoforms.
Splicing graphs and EST assembly problem
TLDR
Using the notion of the splicing graph, a natural and convenient representation of all splicing variants, an algorithm is designed to assemble EST reads into the splicing graph rather than assembling them into each splicing variant in a case-by-case fashion.
TopHat: discovering splice junctions with RNA-Seq
TLDR
The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer.
Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
TLDR
The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
The multiassembly problem: reconstructing multiple transcript isoforms from EST fragment mixtures.
TLDR
An Alternatively Spliced Proteins database (ASP) is constructed from analysis of human expressed and genomic sequences, consisting of 13,384 protein isoforms of 4422 genes, yielding an average of 3.0 protein isoform sequences per gene.
IsoLasso: A LASSO Regression Approach to RNA-Seq Based Transcriptome Assembly
TLDR
IsoLasso is a new RNA-Seq based transcriptome assembly tool based on the well-known LASSO algorithm, a multivariate regression method designated to seek a balance between the maximization of prediction accuracy and the minimization of interpretation.
Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation
TLDR
SLIDE is a statistical method that takes exon boundaries and RNA-Seq data as input to discern the set of mRNA isoforms that are most likely to be present in an RNA-Seq sample; it performs as well as or better than major competitors in both isoform discovery and abundance estimation.
CLIIQ: Accurate Comparative Detection and Quantification of Expressed Isoforms in a Population
TLDR
This work proposes CLIIQ, a novel computational method for identification and quantification of expressed isoforms from multiple samples in a population based on an integer linear programming formulation for identifying and quantifying "the most parsimonious" set of isoforms.
Mapping and quantifying mammalian transcriptomes by RNA-Seq
TLDR
Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs
TLDR
Substantial variation in protein-coding genes is identified, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons, and the gene structures of over a thousand lincRNA and antisense loci are determined.