Interpretable Structured Learning with Sparse Gated Sequence Encoder for Protein-Protein Interaction Prediction

  • KC Kishan, Feng Cui, Anne R. Haake, Rui Li
  • Published 16 October 2020
  • Computer Science, Biology
  • 2020 25th International Conference on Pattern Recognition (ICPR)
Predicting protein-protein interactions (PPIs) by learning informative representations from amino acid sequences is a challenging yet important problem in biology. Although various deep learning models in Siamese architecture have been proposed to model PPIs from sequences, these methods are computationally expensive for a large number of PPIs due to the pairwise encoding process. Furthermore, these methods are difficult to interpret because of non-intuitive mappings from protein sequences to… 
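
The cost argument in the abstract can be illustrated with a small sketch. It is not the paper's model: `encode` is a hypothetical stand-in for any expensive sequence encoder, and the protein names and sequences are invented for illustration. The sketch only counts encoder calls, comparing a scheme that re-encodes both sequences for every candidate pair against one that encodes each protein once and reuses the result.

```python
from itertools import combinations


def make_counting_encoder():
    """Return a toy encoder plus a counter of how often it was called."""
    calls = {"n": 0}

    def encode(sequence):
        # Hypothetical stand-in for an expensive neural encoder:
        # it simply returns amino-acid counts.
        calls["n"] += 1
        return {aa: sequence.count(aa) for aa in set(sequence)}

    return encode, calls


def pairwise_encoding_calls(proteins):
    """Encode both sequences again for every candidate pair."""
    encode, calls = make_counting_encoder()
    for a, b in combinations(proteins, 2):
        encode(proteins[a])
        encode(proteins[b])
    return calls["n"]


def encode_once_calls(proteins):
    """Encode every protein a single time, then reuse the cached result."""
    encode, calls = make_counting_encoder()
    cache = {name: encode(seq) for name, seq in proteins.items()}
    for a, b in combinations(proteins, 2):
        _ = (cache[a], cache[b])  # a pair scorer would consume these
    return calls["n"]


# Invented toy data, for illustration only.
proteins = {
    "P1": "MKTAYIAK",
    "P2": "GAVLIMKT",
    "P3": "MSTNPKPQ",
    "P4": "AAKKLLMM",
}
print(pairwise_encoding_calls(proteins), encode_once_calls(proteins))
```

With n proteins and all pairs considered, the first scheme makes n(n-1) encoder calls while the second makes n, which is one way to read the abstract's concern about large numbers of PPIs.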

Predicting Biomedical Interactions with Higher-Order Graph Convolutional Networks
This paper presents a higher-order graph convolutional network (HOGCN), which collects feature representations of neighbors at various distances and learns their linear mixing to obtain informative representations of biomedical entities.
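
The idea of collecting neighbor features at several distances and mixing them linearly can be sketched as follows. This is a simplified illustration rather than the HOGCN implementation: it assumes a row-normalized adjacency matrix without self-loops, and it uses one fixed scalar weight per hop where a trained model would learn its mixing parameters. The function names are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def row_normalize(adj):
    """Divide each row by its sum so neighbor features are averaged."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def higher_order_mix(adj, features, hop_weights):
    """Combine 0-hop, 1-hop, 2-hop, ... features with one weight per hop.

    hop_weights[k] scales the features propagated over k hops. The weights
    are fixed here (an assumption for illustration), not learned.
    """
    norm = row_normalize(adj)
    hop = features  # 0-hop: each node's own features
    mixed = [[0.0] * len(features[0]) for _ in features]
    for w in hop_weights:
        for i, row in enumerate(hop):
            for j, v in enumerate(row):
                mixed[i][j] += w * v
        hop = matmul(norm, hop)  # propagate one more hop
    return mixed


# Path graph 0 - 1 - 2; only node 0 carries a signal.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]
mixed = higher_order_mix(adj, features, [1.0, 0.5, 0.25])
print(mixed)
```

In this toy graph, node 2 receives a nonzero value only through the 2-hop term, which shows why aggregating over several distances can reach entities that are not direct neighbors.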


Multifaceted protein–protein interaction prediction based on Siamese residual RCNN
PIPR (Protein–Protein Interaction Prediction Based on Siamese Residual RCNN) is an end-to-end framework for PPI prediction that uses only protein sequences and leverages both robust local features and contextualized information, which are significant for capturing the mutual influence of protein sequences.
Predicting protein‐protein interactions through sequence‐based deep learning
A novel deep learning framework, DPPI, is presented, which efficiently applies a deep, Siamese‐like convolutional neural network combined with random projection and data augmentation to predict PPIs, leveraging existing high‐quality experimental PPI data and evolutionary information of a protein pair under prediction.
Prediction of protein-protein interactions from amino acid sequences using a novel multi-scale continuous and discontinuous feature set
A sequence-based approach is developed by combining a Support Vector Machine (SVM) with a novel Multi-scale Continuous and Discontinuous (MCD) feature representation that can sufficiently capture multiple overlapping continuous and discontinuous binding patterns within a protein sequence.
More challenges for machine-learning protein interactions
The analyses suggest that PPIs square the challenge for this task, and that which prediction method appears best depends crucially on the sequence similarity between the test and training sets, the number of true interactions to be found, and the expected ratio of negatives to positives.
Predicting protein–protein interactions based only on sequences information
Different types of PPI networks have been effectively mapped with the proposed method, suggesting that, even with only sequence information, this method could be applied to the exploration of networks for any newly discovered protein with unknown biological relativity.
Evolutionary profiles improve protein-protein interaction prediction from sequence
A new approach to predict PPIs from sequence alone which is based on evolutionary profiles and profile-kernel support vector machines improves over the state-of-the-art, in particular for proteins that are sequence-dissimilar to proteins with known interaction partners.
Predicting protein-protein interactions using signature products
A very general, high-throughput method for predicting protein-protein interactions that combines a sequence-based description of proteins with experimental information that can be gathered from any type of protein-protein interaction screen.
Prediction of protein-protein interactions from protein sequence using local descriptors.
Given the complex nature of PPIs, the performance of the proposed sequence-based method is promising, and it can be a helpful supplement for PPI prediction.
Fusing gene expressions and transitive protein-protein interactions for inference of gene regulatory networks
The GE and PPIN fusion model outperforms both the state-of-the-art single data source models (CLR, GENIE3, TIGRESS) as well as existing fusion models under various constraints.
Using support vector machine combined with auto covariance to predict protein–protein interactions from protein sequences
A sequence-based method is proposed that combines a new feature representation based on auto covariance (AC) with a support vector machine (SVM); it can be a useful supplementary tool for future proteomics studies.