LinearFold: Linear-Time Prediction of RNA Secondary Structures

@article{Deng2018LinearFoldLP,
  title={LinearFold: Linear-Time Prediction of RNA Secondary Structures},
  author={Dezhong Deng and Kai Zhao and David A. Hendrix and David H. Mathews and Liang Huang},
  journal={bioRxiv},
  year={2018}
}
Predicting the secondary structure of an RNA sequence with speed and accuracy is useful in many applications such as drug design. The state-of-the-art predictors have a fundamental limitation: they have a run time that scales cubically with the length of the input sequence, which is slow for longer RNAs and limits the use of secondary structure prediction in genome-wide applications. To address this bottleneck, we designed the first linear-time algorithm for this problem, which can be used with…
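To make the cubic scaling mentioned in the abstract concrete, here is a minimal sketch of a classic cubic-time baseline: Nussinov-style base-pair maximization. It is not the LinearFold algorithm and not the energy model used by state-of-the-art predictors. The function name `max_base_pairs`, the `min_loop` parameter, and the allowed-pair set are assumptions chosen for illustration.

```python
# Illustrative only: Nussinov-style dynamic program that maximizes the number
# of nested base pairs. Real predictors score structures with thermodynamic or
# learned models; this sketch only shows where the O(n^3) cost comes from.

# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs in `seq`.

    `min_loop` is the minimum number of unpaired bases enclosed by a pair
    (a hypothetical default of 3 is used here).
    """
    n = len(seq)
    if n == 0:
        return 0

    # best[i][j] = max number of pairs in seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]

    # Three nested loops over (span, i, k): this is the cubic run time.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))
```

The table has O(n^2) cells and each cell scans O(n) split points, so doubling the sequence length multiplies the work by roughly eight, which is why this family of algorithms becomes slow for long RNAs.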
2 Citations
RNA secondary structure prediction with convolutional neural networks
TLDR
This paper shows how to use an artificial neural network design to predict the structure for a given RNA sequence with high accuracy only by learning from samples whose native structures have been experimentally characterized, independent of any energy model.
Convolutional models of RNA energetics
TLDR
A convolutional neural network architecture is presented to model the energies of nucleic acid motifs, allowing for learning of representations of physical interactions that generalize to arbitrary unmeasured motifs and showing that the model can accurately predict the thermodynamics of hairpins containing unmeasured motifs.

References

Efficient parameter estimation for RNA secondary structure prediction
TLDR
This work presents constraint generation (CG), the first computational approach to RNA free energy parameter estimation that can be efficiently trained on large sets of structural as well as thermodynamic data and achieves significant improvements in prediction accuracy over current state-of-the-art methods.
CONTRAfold: RNA secondary structure prediction without physics-based models
TLDR
CONTRAfold, a novel secondary structure prediction method based on conditional log-linear models (CLLMs), a flexible class of probabilistic models which generalize upon SCFGs by using discriminative training and feature-rich scoring, achieves the highest single sequence prediction accuracies to date.
IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming
TLDR
IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast.
Global or local? Predicting secondary structure and accessibility in mRNAs
TLDR
The results showed that local folding was more accurate than the classic global approach and gave the most robust performance; the work also introduces structure accuracy, a measure that is applicable to both global and local methods.
RNAz 2.0: Improved Noncoding RNA Detection
TLDR
RNAz 2.0 provides significant improvements in two respects: (1) the accuracy is increased by the systematic use of dinucleotide models, and (2) technical limitations of the previous version are overcome by increased training data and the usage of an entropy measure to represent sequence similarities.
A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.
TLDR
TORNADO is a computational tool that can parse a wide spectrum of RNA grammar architectures using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores, and finds that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods.
The Trouble with Long-Range Base Pairs in RNA Folding
TLDR
It is demonstrated that the inclusion of a span-dependent penalty leads to improved maximum expected accuracy structure predictions compared to both the standard MEA model and a modified folding algorithm with an energy penalty function.
Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data
TLDR
A new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm.
Predicting RNA structure: advances and limitations.
TLDR
This chapter describes how to use programs from the ViennaRNA Package to perform common tasks such as prediction of minimum free-energy structures, suboptimal structures, or base pairing probabilities, and generating secondary structure plots with reliability annotation.
Modeling RNA secondary structure folding ensembles using SHAPE mapping data
TLDR
Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium, is introduced, and it is demonstrated, using SHAPE mapping data, that it can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures.