Achieving 80% ten‐fold cross‐validated accuracy for secondary structure prediction by large‐scale training

@article{Dor2007Achieving8T,
  title={Achieving 80\% ten‐fold cross‐validated accuracy for secondary structure prediction by large‐scale training},
  author={Ofer Dor and Yaoqi Zhou},
  journal={Proteins: Structure},
  year={2007},
  volume={66}
}
  • O. Dor, Yaoqi Zhou
  • Published 18 December 2006
  • Computer Science, Medicine
  • Proteins: Structure
An integrated system of neural networks, called SPINE, is established and optimized for predicting structural properties of proteins. SPINE is applied to three‐state secondary‐structure and residue‐solvent‐accessibility (RSA) prediction in this paper. The integrated neural networks are carefully trained with a large dataset of 2640 chains, sequence profiles generated from multiple sequence alignment, representative amino acid properties, a slow learning rate, overfitting protection, and an…
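The title's "ten‐fold cross‐validated accuracy" can be illustrated with a minimal sketch. It assumes the split is made at the chain level and that accuracy means the fraction of residues whose three-state label (H, E, C) is predicted correctly. The function names `ten_fold_splits` and `q3_accuracy` are hypothetical, and the abstract does not describe the authors' actual splitting procedure.

```python
import random


def ten_fold_splits(chain_ids, n_folds=10, seed=0):
    """Yield (train, test) lists of chain ids; every chain is tested exactly once.

    Assumption: chains are shuffled and dealt round-robin into folds.
    """
    ids = list(chain_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [c for j, fold in enumerate(folds) if j != k for c in fold]
        yield train, test


def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)
```

With 2640 chains, each of the ten folds holds out 264 chains for testing while the remaining 2376 are used for training. The cross-validated figure is then computed over the held-out predictions from all ten folds.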
SPINE X: Improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles
A multistep neural‐network algorithm, SPINE X, was developed by iteratively coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles; applied to a dataset of 2640 proteins, it achieved 82.0% accuracy based on 10‐fold cross validation.
Predicting residue–residue contact maps by a two‐layer, integrated neural‐network method
A neural network method (SPINE‐2D) is introduced to provide a sequence‐based prediction of residue–residue contact maps via large‐scale training with overfit protection and a two‐layer neural network.
Real‐SPINE: An integrated system of neural networks for real‐value prediction of protein structural properties
An integrated system of neural networks is established for real‐value prediction and applied to predict residue‐solvent accessibility and backbone ψ dihedral angles of proteins based on information derived from sequences only.
Improving the prediction accuracy of residue solvent accessibility and real‐value backbone torsion angles of proteins by guided‐learning through a two‐layer neural network
A guided‐learning method is introduced, designed to satisfy the intuitive condition that, for most residues, the contribution of one residue to the structural properties of another decreases as their separation along the protein sequence increases; it yields a 2–4% reduction in 10‐fold cross‐validated mean absolute errors.
DNSS2: Improved ab initio protein secondary structure prediction using advanced deep learning architectures
Multiple advanced deep learning architectures (DNSS2) are developed to further improve protein secondary structure prediction; DNSS2 was systematically benchmarked on independent test data sets against eight state‐of‐the‐art tools and consistently ranked as one of the best methods.
Protein secondary structure prediction using a small training set (compact model) combined with a Complex-valued neural network approach
The choice of training proteins is important for preserving a classifier's ability to generalize to new sequences, and SSP techniques that can distinguish backbone hydrogen bonding from side-chain or water-mediated hydrogen bonding might be needed to reduce Coil ⇔ Sheet misclassifications.
Context-Based Features Enhance Protein Secondary Structure Prediction Accuracy
The computational results show that the context-based scores are effective features for enhancing the accuracy of secondary structure prediction, and the three- and eight-state prediction servers implementing the methods are available online.
Multifaceted analysis of training and testing convolutional neural networks for protein secondary structure prediction
A detailed ablation study of different factors suggests the best accuracy results are achieved with good choices for each of them while the neural network architecture is not as critical as long as it is not too simple.
Prediction of Protein Secondary Structure Using Feature Selection and Analysis Approach
A novel method is proposed that uses the binomial distribution to optimize tetrapeptide structural words and increment of diversity with quadratic discriminant to predict protein three-state secondary structure; results suggest that the feature selection technique can detect the optimized tetrapeptide structural words that affect the accuracy of predicted secondary structures.
Sixty-five years of the long march in protein secondary structure prediction: the final stretch?
The time has come to finish off the final stretch of the long march towards protein secondary structure prediction as more powerful deep learning methods with improved capability of capturing long-range interactions begin to emerge as the next generation of techniques for secondary structure prediction.

References

SHOWING 1-10 OF 73 REFERENCES
Prediction of protein secondary structure at 80% accuracy
Secondary structure prediction involving up to 800 neural network predictions has been developed, by use of novel methods such as output expansion and a unique balloting procedure, and with respect to blind prediction, this work is preliminary and awaits evaluation by CASP4.
Combining prediction of secondary structure and solvent accessibility in proteins
It is concluded that an increase in the 3‐state classification accuracy may be achieved when combining RSA with a state‐of‐the‐art protocol utilizing evolutionary profiles, as well as for prediction protocols that implicitly account for RSA in other ways.
Predicting protein secondary structure and solvent accessibility with an improved multiple linear regression method
The multiple linear regression algorithm for protein secondary structure prediction is improved by combining it with the evolutionary information provided by PSI‐BLAST multiple sequence alignments, improving overall per‐residue accuracy and relative solvent accessibility prediction.
Prediction of protein secondary structure at better than 70% accuracy.
A two-layered feed-forward neural network is trained on a non-redundant database to predict the secondary structure of water-soluble proteins; a new key aspect is the use of evolutionary information in the form of multiple sequence alignments, which are used as input in place of single sequences.
Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles
Ensembles of bidirectional recurrent neural network architectures, PSI‐BLAST‐derived profiles, and a large nonredundant training set are used to derive two new predictors for secondary structure predictions, and confusion matrices are reported.
Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure
A server SARpred has been developed that predicts the real value of solvent accessibility of residues for a given protein sequence and achieves a correlation coefficient of 0.68 and a MAE of 15.9% between predicted and observed values.
Two‐stage support vector regression approach for predicting accessible surface areas of amino acids
A two‐stage support vector regression (SVR) approach is proposed to predict real values of ASA from the position‐specific scoring matrices generated from PSI‐BLAST profiles by adding SVR as the second stage to capture the influences on the ASA value of a residue by those of its neighbors.
Application of multiple sequence alignment profiles to improve protein secondary structure prediction
The effect of training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences, is shown to provide a…
Prediction of protein relative solvent accessibility with a two‐stage SVM approach
A two‐stage approach with support vector machines (SVMs) is proposed, where an SVM predictor is introduced to the output of the single‐stage SVM approach to take into account the contextual relationships among solvent accessibilities for the prediction of protein RSA.
NETASA: neural network based prediction of solvent accessibility
A server, NETASA, is implemented for predicting solvent accessibility of amino acids using a newly optimized neural network algorithm, and the applicability of neural networks for ASA prediction has been confirmed with a larger data set and wider range of state thresholds.