Computational phosphorylation site prediction in plants using random forests and organism-specific instance weights

@article{Trost2013ComputationalPS,
  title={Computational phosphorylation site prediction in plants using random forests and organism-specific instance weights},
  author={Brett Trost and Anthony J. Kusalik},
  journal={Bioinformatics},
  year={2013},
  volume={29},
  number={6},
  pages={686--694}
}
MOTIVATION: Phosphorylation is the most important post-translational modification in eukaryotes. Although many computational phosphorylation site prediction tools exist for mammals, and a few were created specifically for Arabidopsis thaliana, none are currently available for other plants. RESULTS: In this article, we propose a novel random forest-based method called PHOSFER (PHOsphorylation Site FindER) for applying phosphorylation data from other organisms to enhance the accuracy of…

Phosphorylation sites prediction using Random Forest
TLDR
RF-Phos 1.0, which uses random forest classifiers to integrate various sequence and structural features, is able to identify putative sites of phosphorylation across many protein families and compares favorably to other existing phosphosite prediction methods.
RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest
TLDR
RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families and compares favorably to other popular mammalian phosphosite prediction methods.
Performance of Canonical Correlation Forest in Phosphorylation Site Predictions
TLDR
The development of a new predictor, termed Canonical Correlation Forest-based Phosphosite (CCF-Phos) predictor, to predict putative phosphorylation sites on a given protein.
Computational prediction and analysis of species-specific fungi phosphorylation via feature optimization strategy
TLDR
A novel method for prediction of species-specific fungi phosphorylation, PreSSFP, was developed; it can identify fungi phosphorylation in seven species for specific serine, threonine and tyrosine residues, and was compared with other existing tools.
RF-Phos: Random forest-based prediction of phosphorylation sites
TLDR
An improved version of this method, termed RF-Phos 1.1, that employs additional sequence-driven features to identify putative sites of phosphorylation across many protein families performs comparably to or better than other existing phosphosite prediction methods, such as PhosphoSVM, GPS2.1 and Musite.
SKIPHOS: non-kinase specific phosphorylation site prediction with random forests and amino acid skip-gram embeddings
TLDR
This study introduces a non-kinase specific phosphorylation site prediction model based on random forests on top of a continuous distributed representation of amino acids that is compared to three recent methods including PhosphoSVM, iPhos-PreEn and RFPhos.
Phosphorylation site prediction
Prediction of protein kinase-specific phosphorylation sites in hierarchical structure using functional information and random forest
TLDR
This work conducts a systematic and hierarchy-specific investigation of protein phosphorylation site prediction in which protein kinases are clustered into hierarchical structures with four levels (kinase, subfamily, family and group) and demonstrates that the proposed method remarkably outperforms existing phosphorylation prediction methods at all hierarchical levels.
PhospredRF: Prediction of protein phosphorylation sites using a consensus of random forest classifiers
TLDR
This research uses machine learning-based approaches to predict the positions where phosphorylation occurs, and the system attained the best performance for a set of 22 non-trivial proteins.
Application of Machine Learning Techniques to Predict Protein Phosphorylation Sites
TLDR
A comparison with the predictive performance of PhosphoSVM and Musite reveals that the prediction performance of the proposed method is better, and it has the advantages of simplicity, practicality and low time complexity in classification.

References

SHOWING 1-10 OF 51 REFERENCES
A New Machine Learning Approach for Protein Phosphorylation Site Prediction in Plants
TLDR
A new machine learning approach for phosphorylation site prediction is proposed, which incorporates protein sequence information and protein disordered regions and integrates the k-nearest neighbor and support vector machine techniques for predicting phosphorylation sites.
Computational prediction of eukaryotic phosphorylation sites
TLDR
This review summarizes, categorizes and compares the available methods for phosphorylation site prediction, and provides an overview of the challenges that are faced when designing predictors and how they have been addressed.
PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity
TLDR
This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs using maximal dependence decomposition (MDD), and results show that the MDD-clustered models outperform models trained without using MDD.
Machine learning approach to predict protein phosphorylation sites by incorporating evolutionary information
TLDR
A novel generalized prediction system, PPRED (Phosphorylation PREDictor), is proposed that ignores the kinase information and only uses the evolutionary information of proteins for classifying phosphorylation sites.
PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor
TLDR
A set of 802 experimentally validated serine phosphorylation sites is used to develop a method for prediction of serine phosphorylation (pSer) in Arabidopsis, and these prediction results are summarized graphically in the PhosPhAt database.
A summary of computational resources for protein phosphorylation.
TLDR
This review presents a comprehensive but brief summary of computational resources for protein phosphorylation, including phosphorylation databases, prediction of non-specific or organism-specific phosphorylation sites, prediction of kinase-specific phosphorylation sites or phospho-binding motifs, and other tools.
PhosPhAt: the Arabidopsis thaliana phosphorylation site database. An update
TLDR
The PhosPhAt database of Arabidopsis phosphorylation sites is now more of a web application with the inclusion of advanced search functions allowing combinatorial searches by Boolean terms and functional annotation of proteins using MAPMAN ontology.
PhosphoGRID: a database of experimentally verified in vivo protein phosphorylation sites from the budding yeast Saccharomyces cerevisiae
TLDR
A database of experimentally verified in vivo phosphorylation sites curated from the S. cerevisiae primary literature will provide a valuable benchmark for proteome-level studies and will facilitate bioinformatic analysis of cellular signal transduction networks.
GANNPhos: a new phosphorylation site predictor based on a genetic algorithm integrated neural network.
TLDR
Using a genetic algorithm integrated neural network (GANN), a new bioinformatics method named GANNPhos has been developed to predict phosphorylation sites in proteins, and when benchmarked against Back-Propagation neural network and Support Vector Machine algorithms, GANNPhos gives better performance.
Sequence and structure-based prediction of eukaryotic protein phosphorylation sites.
TLDR
An artificial neural network method is presented that predicts phosphorylation sites in independent sequences with a sensitivity in the range from 69% to 96% and predicts novel phosphorylation sites in the p300/CBP protein that may regulate interaction with transcription factors and histone acetyltransferase activity.