LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search
TLDR
This work presents the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure; it yields significantly more accurate predictions on the longest sequence families in its benchmark database, as well as improved accuracy for long-range base pairs.
LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities
TLDR
This paper designs a linear-time heuristic algorithm, LinearPartition, that approximates the partition function and base-pairing probabilities and is shown to be orders of magnitude faster than Vienna RNAfold and CONTRAfold (e.g., 2.5 days versus 1.3 min).
CoV-Seq, a New Tool for SARS-CoV-2 Genome Analysis and Visualization: Development and Usability Study
Background: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with
LinearDesign: Efficient Algorithms for Optimized mRNA Sequence Design
TLDR
This work provides efficient computational tools to speed up and improve mRNA vaccine development. It develops two algorithms for incorporating codon optimality into the design: one based on k-best parsing to find alternative sequences, and one that builds codon optimization directly into the dynamic programming.
ThreshKnot: Thresholded ProbKnot for Improved RNA Secondary Structure Prediction
TLDR
It is suggested that ThreshKnot replace maximum expected accuracy (MEA) as the default partition-function-based algorithm for RNA structure prediction because of its higher prediction accuracy, its ability to predict pseudoknots, its faster runtime, and its easier implementation.
CoV-Seq: SARS-CoV-2 Genome Analysis and Visualization
Summary: COVID-19 has become a global pandemic not long after its inception in late 2019. SARS-CoV-2 genomes are being sequenced and shared on public repositories at a fast pace. To keep up with these
Learning to Fold RNAs in Linear Time
TLDR
This work presents a linear-time, machine-learning-based folding system that uses the recently proposed approximate folding tool LinearFold as its inference engine and structured SVM (sSVM) as its training algorithm. It also introduces a max-violation update strategy to remedy the non-convergence of naive sSVM under inexact search inference.
LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2
TLDR
LinearTurboFold is a linear-time algorithm that is orders of magnitude faster than prior approaches, making it the first method to simultaneously fold and align whole genomes of SARS-CoV-2 variants, the longest known RNA virus (∼30 kilobases).
Improved and Linear-Time Stochastic Sampling of RNA Secondary Structure with Applications to SARS-CoV-2
TLDR
LinearSampling is the first RNA structure sampling algorithm to scale to the full genome of SARS-CoV-2 without local window constraints, taking only 69.2 seconds on its reference sequence, and its results correlate well with experimentally guided structures.
CoV-Seq: SARS-CoV-2 Genome Analysis and Visualization.
Background: COVID-19 has become a global pandemic not long after its inception in late 2019. SARS-CoV-2 genomes are being sequenced and shared on public repositories at a fast pace. To keep up with