DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier
@article{Kulmanov2017DeepGOPP, title={DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier}, author={Maxat Kulmanov and Mohammed Asif Khan and R. Hoehndorf}, journal={Bioinformatics}, year={2017}, volume={34}, pages={660--668} }
Motivation
A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. […] The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem.
Results
We have developed a novel method to predict protein function from sequence. We use deep learning to learn features…
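The multi-label, ontology-structured setting described above can be illustrated with a minimal sketch. It assumes that an annotation to a class implies annotation to all of its ancestors along is-a relations, and it uses a tiny hypothetical hierarchy (`TOY_PARENTS`) with made-up class names rather than real GO identifiers. The helper names (`ancestors`, `propagate`, `to_label_vector`) are illustrative only; this is not the DeepGO implementation.

```python
# Toy is-a hierarchy (child -> list of parents).
# Hypothetical class names for illustration, not real GO identifiers.
TOY_PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "kinase": ["catalysis"],
    "atp_binding": ["binding"],
    "protein_kinase": ["kinase", "atp_binding"],
}


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def propagate(annotations, parents):
    """Close a set of annotations under the is-a relation."""
    closed = set()
    for term in annotations:
        closed |= ancestors(term, parents)
    return closed


def to_label_vector(annotations, classes, parents):
    """Encode propagated annotations as a binary multi-label target vector."""
    closed = propagate(annotations, parents)
    return [1 if c in closed else 0 for c in classes]


# Fixed class order for the label vector.
CLASSES = sorted(TOY_PARENTS)

# A protein annotated with two classes receives several positive labels at once.
vector = to_label_vector({"kinase", "binding"}, CLASSES, TOY_PARENTS)
print(dict(zip(CLASSES, vector)))
```

With more than 40 000 classes the same encoding yields a very long, sparse target vector per protein, which is what makes the problem large-scale.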
258 Citations
DeepAdd: Protein function prediction from k-mer embedding and additional features
- Computer Science
- Comput. Biol. Chem.
- 2020
SDN2GO: An Integrated Deep Learning Model for Protein Function Prediction
- Computer Science
- Frontiers in Bioengineering and Biotechnology
- 2020
An integrated deep-learning-based classification model, named SDN2GO, is proposed to predict protein functions. It borrows from natural language processing to handle domain information, uses a pre-trained deep learning sub-model to extract comprehensive domain features, and outperforms other methods on each sub-ontology of GO.
DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions
- Computer Science, Biology
- Proteomics
- 2019
A novel deep learning framework, DeepFunc, is proposed which accurately predicts protein functions from protein sequence‐ and network‐derived information and outperforms current methods on the testing dataset and on the Critical Assessment of protein Function Annotation algorithms (CAFA) 3 dataset.
A Deep Learning Framework for Gene Ontology Annotations With Sequence- and Network-Based Information
- Computer Science, Biology
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- 2021
A deep learning framework, DeepGOA, is proposed to predict protein functions from protein sequences and protein-protein interaction (PPI) networks; the experimental results show that it outperforms DeepGO and BLAST.
Protein function prediction with gene ontology: from traditional to deep learning models
- Computer Science
- PeerJ
- 2021
This work reviewed the currently available computational GO annotation methods for proteins, ranging from conventional to deep learning approaches, selected suitable predictors from among the reviewed tools, and conducted a mini comparison of their performance using a worldwide challenge dataset.
Gene Ontology based protein functional annotation using pretrained embeddings
- Computer Science, Biology
- 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
- 2022
The experiment showed that protein embeddings created using pretrained transformer models can be used as a source of data for tasks involving sequence prediction, with a focus on protein functions.
Integrating unsupervised language model with triplet neural networks for protein gene ontology prediction
- Computer Science
- PLoS Comput. Biol.
- 2022
A novel deep-learning method to predict Gene Ontology attributes of proteins through a triplet neural-network architecture embedded with pre-trained language models from protein sequences, demonstrating a new avenue for high-accuracy deep-learning function prediction that is applicable to large-scale protein function annotations from sequence alone.
PANDA2: protein function prediction using graph neural networks
- Computer Science
- NAR genomics and bioinformatics
- 2022
A deep learning system named PANDA2 was developed to predict protein functions, which used the cutting-edge graph neural network to model the topology of the GO DAG and integrated the features generated by transformer protein language models.
A Deep Learning Framework for Predicting Protein Functions With Co-Occurrence of GO Terms
- Computer Science
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- 2023
This work proposes a new deep learning model, named DeepPFP-CO, which uses Graph Convolutional Network (GCN) to explore and capture the co-occurrence of GO terms to improve the protein function prediction performance.
DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms
- Computer Science
- bioRxiv
- 2022
DeepGOZero, a machine learning model, is developed; it improves predictions for functions with no or only a small number of annotations and can exploit formal axioms in the GO to make zero-shot predictions, i.e., predict protein functions even if not a single protein in the training phase was associated with that function.
44 References
FFPred 3: feature-based function prediction for all Gene Ontology domains
- Biology
- Scientific Reports
- 2016
This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation.
CombFunc: predicting protein function using heterogeneous data sources
- Biology, Computer Science
- Nucleic Acids Res.
- 2012
The CombFunc web server, which makes Gene Ontology (GO)-based protein function predictions, is presented; it combines ConFunc, the existing function prediction method, with other approaches for function prediction that use protein sequence, gene expression and protein–protein interaction data.
Hierarchical Classification of Gene Ontology Terms Using the Gostruct Method
- Computer Science
- J. Bioinform. Comput. Biol.
- 2010
This work proposes a method that directly predicts a full functional annotation of a protein by modeling the structure of the Gene Ontology hierarchy in the framework of kernel methods for structured-output spaces.
Roles for text mining in protein function prediction.
- Biology
- Methods in molecular biology
- 2014
This chapter introduces two main strategies for association of function terms, represented as Gene Ontology terms, to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.
STRING v10: protein–protein interaction networks, integrated over the tree of life
- Biology
- Nucleic Acids Res.
- 2015
Hierarchical and self-consistent orthology annotations are introduced for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution in the STRING database.
Information-theoretic evaluation of predicted ontological annotations
- Computer Science
- Bioinform.
- 2013
An information-theoretic framework is proposed that uses a Bayesian network, structured according to the underlying ontology, to model the prior probability of a protein’s function and proposes a single statistic, referred to as semantic distance, that can be used to rank classification models.
Functional classification of CATH superfamilies: a domain-based approach for protein function annotation
- Biology
- Bioinform.
- 2016
A domain-based method for protein function classification and prediction of functional sites that exploits functional sub-classification of CATH superfamilies, FunFHMMer, which generates more functionally coherent groupings of protein sequences than other domain-based protein classifications.
Predicting Protein Function by Multi-Label Correlated Semi-Supervised Learning
- Computer Science
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- 2012
This work proposes a new algorithm, Multi-label Correlated Semi-supervised Learning (MCSL), to incorporate the intrinsic correlations among functional classes into protein function prediction by leveraging the relationships provided by the PPI network and the functional class network.
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model
- Computer Science, Biology
- bioRxiv
- 2016
A new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks that greatly outperforms existing methods and leads to much more accurate contact-assisted folding.