Correcting mistakes in predicting distributions
@article{MarotLassauzaie2018CorrectingMI,
  title={Correcting mistakes in predicting distributions},
  author={Val{\'e}rie Marot-Lassauzaie and Michael Bernhofer and Burkhard Rost},
  journal={Bioinformatics},
  year={2018},
  volume={34},
  pages={3385--3386}
}
Abstract
Motivation: Many applications monitor predictions of a whole range of features for biological datasets, e.g. the fraction of secreted human proteins in the human proteome. Results and error estimates are typically derived from publications.
Results: Here, we present a simple, alternative approximation that uses performance estimates of methods to error-correct the predicted distributions. This approximation uses the confusion matrix (TP true positives, TN true negatives, FP false…
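As a rough illustration of how a confusion matrix can be used to correct a predicted fraction, the Python sketch below inverts the relation between the true and the predicted positive fraction. It is not taken from the paper, whose exact formula is cut off in the abstract; the function name and the example counts are hypothetical, and the sketch assumes that the error rates measured on a benchmark carry over to the dataset being analysed.

```python
def corrected_positive_fraction(predicted_fraction, tp, fn, fp, tn):
    """Estimate the true positive fraction from a predicted one.

    Assumption (not from the paper): the predicted fraction p_pred relates
    to the true fraction p_true as
        p_pred = sensitivity * p_true + false_positive_rate * (1 - p_true)
    which is solved here for p_true.
    """
    sensitivity = tp / (tp + fn)            # share of real positives that are found
    false_positive_rate = fp / (fp + tn)    # share of real negatives called positive
    denominator = sensitivity - false_positive_rate
    if denominator <= 0:
        raise ValueError("method is no better than random; correction undefined")
    estimate = (predicted_fraction - false_positive_rate) / denominator
    # Clip to the valid range, since sampling noise can push the estimate outside [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical benchmark counts: sensitivity 0.9, false positive rate 0.1.
# A method with these rates that predicts 34% positives implies about 30% true positives.
print(round(corrected_positive_fraction(0.34, tp=90, fn=10, fp=20, tn=180), 3))
```

The correction grows unstable as sensitivity and false positive rate approach each other, which is why the sketch refuses to divide by a non-positive denominator.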
6 Citations
Spectrum of protein localization in proteomes captures evolutionary relation between species
- Biology
- bioRxiv
- 2019
To gauge the bias of prediction methods, all available experimental annotations for the human proteome were merged and important values in both Swiss-Prot and the Human Protein Atlas were found.
Spectrum of Protein Location in Proteomes Captures Evolutionary Relationship Between Species
- Biology
- Journal of molecular evolution
- 2021
When the location spectra for ten eukaryotes were compared, known phylogenetic relations were reproduced better by paralog-only than by ortholog-only trees, and aspects of cross-species comparisons usually revealed only by much more detailed evolutionary comparisons were captured.
Detailed prediction of protein sub-nuclear localization
- Biology
- BMC Bioinformatics
- 2019
A new method is presented that predicts nuclear substructures from sequence alone; it predicts subnuclear compartments and traveler proteins accurately, and carries important information about functionality and PPIs.
ProNA2020 predicts protein-DNA, protein-RNA and protein-protein binding proteins and residues from sequence.
- Biology, Computer Science
- Journal of molecular biology
- 2020
Detecting Novel Sequence Signals in Targeting Peptides Using Deep Learning
- Biology
- bioRxiv
- 2019
This work presents TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, it finds that the second residue in the protein, i.e. the one following the initial methionine, has a strong influence on the classification.
Detecting sequence signals in targeting peptides using deep learning
- Biology
- Life Science Alliance
- 2019
During the development of TargetP 2.0, a state-of-the-art method to predict targeting signals, we find a previously overlooked biological signal for subcellular targeting using the output from a deep…
References
Evaluation of transmembrane helix predictions in 2014
- Biology
- Proteins
- 2015
This work re‐examined the state of the art in transmembrane helix prediction based on a nonredundant dataset with 190 high‐resolution structures and found PolyPhobius and MEMSAT‐SVM outperformed other methods.
LocTree2 predicts localization for all domains of life
- Biology
- Bioinform.
- 2012
The resulting method, LocTree2, works well even for protein fragments, uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting and compares favorably with top methods when tested on new data.
Transmembrane helix predictions revisited
- Biology
- Protein science : a publication of the Protein Society
- 2002
Surprisingly, it was found that proteins with more than five helices were predicted at a significantly lower accuracy than proteins with five or fewer, suggesting that structurally unsolved multispanning membrane proteins will remain problematic for transmembrane helix prediction algorithms.
Hum‐mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features
- Computer Science
- Bioinform.
- 2017
A novel feature representation protocol, denoted HCM (Hidden Correlation Modeling), is presented that creates more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms.
MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction
- Biology
- BMC Bioinformatics
- 2009
MultiLoc2 is an extensive high-performance subcellular protein localization prediction system that outperforms other prediction systems in two benchmark studies and yields higher accuracies than its previous version.
UniProt Protein Knowledgebase.
- Biology
- Methods in molecular biology
- 2017
This chapter introduces the functionality and data provided by UniProt, and describes example use cases for which you might come to UniProt and the methods to help you achieve your goals.
A subcellular map of the human proteome
- Biology
- Science
- 2017
A subcellular map of the human proteome is presented to facilitate functional exploration of individual proteins and their role in human biology and disease and integrated into existing network models of protein-protein interactions for increased accuracy.
Evaluation of transmembrane helix predictions
- Proteins
- 2015