An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins

@article{Martelli2003AnEM,
  title={An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins},
  author={Pier Luigi Martelli and Piero Fariselli and Rita Casadio},
  journal={Bioinformatics},
  year={2003},
  volume={19 Suppl 1},
  pages={i205-11}
}
MOTIVATION: All-alpha membrane proteins constitute a functionally relevant subset of the whole proteome. Their content ranges from about 10 to 30% of the cell proteins, based on sequence comparison and specific predictive methods. Due to the paucity of membrane proteins solved with atomic resolution, the training/testing sets of predictive methods for protein topography and topology routinely include very few well-solved structures mixed with a hundred proteins known with low resolution…
Evaluation of methods for predicting the topology of β-barrel outer membrane proteins and a consensus prediction method
TLDR
The consensus prediction method described in this work optimizes the predicted topology with a dynamic programming algorithm and is implemented in a web-based application freely available to non-commercial users at http://bioinformatics.uoa.gr/ConBBPRED.
Improving the accuracy of transmembrane protein topology prediction using evolutionary information
TLDR
A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins.
Benchmarking subcellular localization and variant tolerance predictors on membrane proteins
TLDR
The analysis indicated that the best tools had similar prediction performance on the transmembrane, inside and outside regions of transmembrane proteins, comparable to their overall prediction performance for all types of proteins.
MetaTM - a consensus method for transmembrane protein topology prediction
TLDR
A novel TM consensus method, named MetaTM, which is based on support vector machine models and combines the results of six TM topology predictors and two signal peptide predictors, and has higher accuracy than a previous consensus predictor.
Enhanced membrane protein topology prediction using a hierarchical classification method and a new scoring function.
TLDR
A hierarchical classification method using support vector machines that integrates selected features by capturing the sequence-to-structure relationship and developing a new scoring function based on membrane protein folding that can facilitate the annotation of membrane proteomes to extract useful structural and functional information.
Transmembrane protein topology prediction using support vector machines
TLDR
The high accuracy of TM topology prediction which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, make this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.
Deep Conditional Random Field Approach to Transmembrane Topology Prediction and Application to GPCR Three-Dimensional Structure Modeling
TLDR
A novel deep approach based on conditional random fields, named dCRF-TM, for predicting the topology of transmembrane proteins, which demonstrated more robust performance on large proteins (>350 residues) than 11 state-of-the-art predictors.
Transmembrane protein structure prediction using machine learning
TLDR
This thesis describes the development and application of machine learning-based methods for the prediction of alpha-helical transmembrane protein structure from sequence alone and achieves state-of-the-art performance in predicting topology and discriminating between globular and transmembrane proteins.
Advances in Computational Methods for Transmembrane Protein Structure Prediction
TLDR
This chapter reviews the existing methods for the identification, topology prediction and three-dimensional modelling of TM proteins, including a discussion of the recent advances in identifying residue-residue contacts from large multiple sequence alignments that have enabled impressive gains to be made in the field of TM protein structure prediction.

References

A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins
TLDR
An HMM is developed that can predict the topology of beta barrel membrane proteins using evolutionary information as input; the development of a specific input for HMMs based on multiple sequence alignment is novel.
A model recognition approach to the prediction of all-helical membrane protein structure and topology.
TLDR
The method employs a set of statistical tables (log likelihoods) compiled from well-characterized membrane protein data, and a novel dynamic programming algorithm to recognize membrane topology models by expectation maximization.
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.
TLDR
A new membrane protein topology prediction method, TMHMM, based on a hidden Markov model is described and validated, and it is discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies.
Prediction of the transmembrane regions of β‐barrel membrane proteins with a neural network‐based predictor
A method based on neural networks is trained and tested on a nonredundant set of β‐barrel membrane proteins known at atomic resolution with a jackknife procedure. The method predicts the topography
Transmembrane helices predicted at 95% accuracy
TLDR
A neural network system that predicts the locations of transmembrane helices in integral membrane proteins by using evolutionary information as input to the network system significantly improved on a previously published neural network prediction method based on single sequence information.
Topology prediction for helical transmembrane proteins at 86% accuracy
TLDR
The improvement is achieved by a dynamic programming‐like algorithm that optimizes helices compatible with the neural network output and the extension is the prediction of topology by applying to the refined prediction the observation that positively charged residues are more abundant in extra‐cytoplasmic regions.
Principles governing amino acid composition of integral membrane proteins: application to topology prediction.
TLDR
The method successfully predicted all the transmembrane segments in 143 proteins out of the 158, and for 135 of these proteins both the membrane spanning regions and the topologies were predicted correctly.
Transmembrane helix predictions revisited
TLDR
Surprisingly, it was found that proteins with more than five helices were predicted at a significantly lower accuracy than proteins with five or fewer, suggesting that structurally unsolved multispanning membrane proteins will remain problematic for transmembrane helix prediction algorithms.
Machine learning approaches for the prediction of signal peptides and other protein sorting signals.
TLDR
A hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors, and it is shown how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii.
MaxSubSeq: an algorithm for segment-length optimization. The case study of the transmembrane spanning segments
TLDR
This paper describes a general dynamic programming-like algorithm specifically designed to optimize the number and length of segments with constrained length in a given protein sequence, and presents the detailed description of MaxSubSeq.