Correcting mistakes in predicting distributions

@article{MarotLassauzaie2018CorrectingMI,
  title={Correcting mistakes in predicting distributions},
  author={Val{\'e}rie Marot-Lassauzaie and Michael Bernhofer and Burkhard Rost},
  journal={Bioinformatics},
  year={2018},
  volume={34},
  pages={3385--3386}
}
Abstract

Motivation: Many applications monitor predictions of a whole range of features for biological datasets, e.g. the fraction of secreted human proteins in the human proteome. Results and error estimates are typically derived from publications.

Results: Here, we present a simple, alternative approximation that uses performance estimates of methods to error-correct the predicted distributions. This approximation uses the confusion matrix (TP true positives, TN true negatives, FP false…
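The abstract is truncated before the authors' formula appears, so the following is only a minimal sketch of one standard way to correct a predicted fraction from confusion-matrix counts (often called the Rogan-Gladen estimator). It is not necessarily the approximation used in the paper. It assumes the observed fraction satisfies observed = TPR * p + FPR * (1 - p), where p is the true fraction of positives. The function name and the example counts are hypothetical.

```python
def corrected_positive_fraction(observed_fraction, tp, tn, fp, fn):
    """Estimate the true fraction of positives from a predicted fraction.

    Assumption (not taken from the paper): the predictor's true-positive
    rate and false-positive rate, measured on a benchmark, also hold on
    the dataset being monitored.
    """
    tpr = tp / (tp + fn)  # sensitivity: positives correctly predicted
    fpr = fp / (fp + tn)  # negatives wrongly predicted as positive
    denom = tpr - fpr
    if denom <= 0:
        raise ValueError("predictor is no better than random; cannot correct")
    # Solve observed = tpr * p + fpr * (1 - p) for p.
    estimate = (observed_fraction - fpr) / denom
    # Clip to the valid range, since sampling noise can push it outside [0, 1].
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Hypothetical benchmark counts: TPR = 0.9, FPR = 0.2.
    print(corrected_positive_fraction(0.41, tp=90, tn=80, fp=20, fn=10))
```

With these hypothetical counts, a method that labels 41% of a proteome as positive would be consistent with a true fraction of about 30%, because one fifth of the negatives are expected to be mislabeled as positive.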

Spectrum of protein localization in proteomes captures evolutionary relation between species
TLDR
To gauge the bias of prediction methods, all available experimental annotations for the human proteome were merged and important values in both Swiss-Prot and the Human Protein Atlas were found.
Spectrum of Protein Location in Proteomes Captures Evolutionary Relationship Between Species
TLDR
When the location spectra of ten eukaryotes were compared, known phylogenetic relations were reproduced better by paralog-only than by ortholog-only trees, and aspects of cross-species comparisons usually revealed only by much more detailed evolutionary comparisons were captured.
Detailed prediction of protein sub-nuclear localization
TLDR
A new method is presented that predicts nuclear substructures from sequence alone. It predicts subnuclear compartments and traveler proteins accurately and carries important information about functionality and PPIs.
Detecting Novel Sequence Signals in Targeting Peptides Using Deep Learning
TLDR
This work presents TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. Examining the strongest signals from the attention layer in the network shows that the second residue in the protein, i.e. the one following the initial methionine, has a strong influence on the classification.
Detecting sequence signals in targeting peptides using deep learning
During the development of TargetP 2.0, a state-of-the-art method to predict targeting signals, we find a previously overlooked biological signal for subcellular targeting using the output from a deep

References

Evaluation of transmembrane helix predictions in 2014
TLDR
This work re‐examined the state of the art in transmembrane helix prediction based on a nonredundant dataset with 190 high‐resolution structures and found PolyPhobius and MEMSAT‐SVM outperformed other methods.
LocTree2 predicts localization for all domains of life
TLDR
The resulting method, LocTree2, works well even for protein fragments, uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting and compares favorably with top methods when tested on new data.
Transmembrane helix predictions revisited
TLDR
Surprisingly, it was found that proteins with more than five helices were predicted at a significantly lower accuracy than proteins with five or fewer, suggesting that structurally unsolved multispanning membrane proteins will remain problematic for transmembrane helix prediction algorithms.
Hum‐mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features
TLDR
A novel feature representation protocol, denoted HCM (Hidden Correlation Modeling), creates more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms.
MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction
TLDR
MultiLoc2 is an extensive high-performance subcellular protein localization prediction system that outperforms other prediction systems in two benchmark studies and yields higher accuracies than its previous version.
UniProt Protein Knowledgebase.
TLDR
This chapter introduces the functionality and data provided by UniProt, and describes example use cases for which you might come to UniProt and the methods to help you achieve your goals.
A subcellular map of the human proteome
TLDR
A subcellular map of the human proteome is presented to facilitate functional exploration of individual proteins and their role in human biology and disease and integrated into existing network models of protein-protein interactions for increased accuracy.
Evaluation of transmembrane helix predictions
Proteins, 2015