DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

@article{Kulmanov2018DeepGOPP,
  title={DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier},
  author={Maxat Kulmanov and Mohammed Asif Khan and R. Hoehndorf},
  journal={Bioinformatics},
  year={2018},
  volume={34},
  pages={660--668}
}
Motivation A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. [...] Key Method The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. Results We have developed a novel method to predict protein function from sequence. We use deep learning to learn features…
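To make the "multi-class, multi-label" framing concrete, here is a minimal sketch of how one protein's annotations might be encoded as a binary target vector over ontology classes, with each annotation propagated to its ancestors. This is an illustration only, not the encoding used in the paper: the toy ontology, its term names, and the helpers `ancestors` and `multi_label_vector` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology (child -> parents); these are NOT real GO terms.
TOY_PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "kinase": ["catalysis", "binding"],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the term together with all of its ancestors in the DAG."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def multi_label_vector(
    annotations: List[str],
    parents: Dict[str, List[str]],
    classes: List[str],
) -> List[int]:
    """Encode annotations as a 0/1 vector, closed under the ancestor relation."""
    closed: Set[str] = set()
    for term in annotations:
        closed |= ancestors(term, parents)
    return [1 if cls in closed else 0 for cls in classes]


# Fixed class order, so every protein gets a vector of the same length.
CLASSES = sorted(TOY_PARENTS)

if __name__ == "__main__":
    print(CLASSES)
    print(multi_label_vector(["kinase"], TOY_PARENTS, CLASSES))
```

A protein can have several entries set to 1 at once, which is what makes the task multi-label rather than single-label; with tens of thousands of classes the vectors become long and sparse.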
DeepAdd: Protein function prediction from k-mer embedding and additional features
TLDR
A new method to predict protein function from sequence was developed that uses the dependencies between GO classes as background information to construct a deep learning model, and it shows noticeable improvement over several methods compared on the CAFA3 datasets, such as FFPred, DeepGO and GoFDR.
SDN2GO: An Integrated Deep Learning Model for Protein Function Prediction
TLDR
An integrated deep-learning-based classification model, named SDN2GO, is proposed to predict protein functions; it outperforms other methods on each sub-ontology of GO, borrows from natural language processing to handle domain information, and uses a pre-trained deep learning sub-model to extract comprehensive domain features.
DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions
TLDR
A novel deep learning framework, DeepFunc, is proposed which accurately predicts protein functions from protein sequence‐ and network‐derived information and outperforms current methods on the testing dataset and on the Critical Assessment of protein Function Annotation algorithms (CAFA) 3 dataset.
Protein function prediction with gene ontology: from traditional to deep learning models
TLDR
This work reviewed the currently available computational GO annotation methods for proteins, ranging from conventional to deep learning approaches, selected some suitable predictors from among the reviewed tools, and conducted a mini comparison of their performance using a worldwide challenge dataset.
A deep learning framework for gene ontology annotations with sequence- and network-based information.
  • Fuhao Zhang, Hong Song, +4 authors Min Li
  • Medicine
  • IEEE/ACM transactions on computational biology and bioinformatics
  • 2020
TLDR
A deep learning framework to predict protein functions with protein sequences and protein-protein interaction (PPI) networks is proposed and the experimental results show that DeepGOA outperforms DeepGO and BLAST.
DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
TLDR
DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, is proposed as a solution to Gene Ontology based protein function prediction; its neural network architecture can also be applied to the prediction of other types of ontological associations.
DEEPGONET: Multi-Label Prediction of GO Annotation for Protein from Sequence Using Cascaded Convolutional and Recurrent Network
  • S. M. S. Islam, M. Hasan
  • Computer Science
  • 2018 21st International Conference of Computer and Information Technology (ICCIT)
  • 2018
TLDR
DEEPGONET, a novel cascaded convolutional and recurrent neural network, is presented to predict the top level of the GO hierarchy, taking only the primary sequence of a protein as input. This makes it more useful than other prevailing state-of-the-art deep learning based methods with multi-modal input, which are less applicable to proteins for which only the primary sequence is available.
A deep learning ensemble for function prediction of hypothetical proteins from pathogenic bacterial species
TLDR
A novel attempt, among existing machine learning based protein function prediction systems built on mixed organisms, to categorize the function of a bacterial hypothetical/unreviewed protein into 1739 GO terms as functional classes, dedicated entirely to bacterial organisms.
DeepPPF: A deep learning framework for predicting protein family
TLDR
DeepPPF, a novel deep learning framework for predicting protein family, is proposed; it employs the word2vec technique to capture distributional dependencies among nucleotides and discovers rich features from diverse motif lengths to characterize proteins.
DeepGOPlus: improved protein function prediction from sequence
TLDR
A novel method for predicting protein functions from sequence alone, which combines a deep convolutional neural network (CNN) model with sequence similarity based predictions, is developed and compared with state-of-the-art methods such as DeepText2GO and GOLabeler on another dataset.

References

SHOWING 1-10 OF 51 REFERENCES
FFPred 3: feature-based function prediction for all Gene Ontology domains
TLDR
This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation.
CombFunc: predicting protein function using heterogeneous data sources
TLDR
The CombFunc web server, which makes Gene Ontology (GO)-based protein function predictions, is presented; it incorporates ConFunc, the existing function prediction method, with other approaches for function prediction that use protein sequence, gene expression and protein–protein interaction data.
Hierarchical Classification of Gene Ontology Terms Using the Gostruct Method
TLDR
This work proposes a method that directly predicts a full functional annotation of a protein by modeling the structure of the Gene Ontology hierarchy in the framework of kernel methods for structured-output spaces.
Roles for text mining in protein function prediction.
TLDR
This chapter introduces two main strategies for association of function terms, represented as Gene Ontology terms, to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.
STRING v10: protein–protein interaction networks, integrated over the tree of life
TLDR
Hierarchical and self-consistent orthology annotations are introduced for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution in the STRING database.
Information-theoretic evaluation of predicted ontological annotations
TLDR
An information-theoretic framework is proposed that uses a Bayesian network, structured according to the underlying ontology, to model the prior probability of a protein’s function, together with a single statistic, referred to as semantic distance, that can be used to rank classification models.
GoFDR: A sequence alignment based method for predicting protein functions.
TLDR
GoFDR is of great value not only for annotating protein functions in newly sequenced genomes, but also for characterizing the function of proteins of interest.
Functional classification of CATH superfamilies: a domain-based approach for protein function annotation
TLDR
A domain-based method for protein function classification and prediction of functional sites, FunFHMMer, which exploits functional sub-classification of CATH superfamilies and generates more functionally coherent groupings of protein sequences than other domain-based protein classifications.
Predicting Protein Function by Multi-Label Correlated Semi-Supervised Learning
  • J. Jiang, L. McQuay
  • Mathematics, Computer Science
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • 2012
TLDR
This work proposes a new algorithm, Multi-label Correlated Semi-supervised Learning (MCSL), to incorporate the intrinsic correlations among functional classes into protein function prediction by leveraging the relationships provided by the PPI network and the functional class network.
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model
TLDR
A new deep learning method predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks; it greatly outperforms existing methods and leads to much more accurate contact-assisted folding.