HMD-AMP: Protein Language-Powered Hierarchical Multi-label Deep Forest for Annotating Antimicrobial Peptides

Qinze Yu, Zhihang Dong, Xingyu Fan, Licheng Zong, Yu Li
Identifying the targets of an antimicrobial peptide is a fundamental step in studying the innate immune response and combating antibiotic resistance, and more broadly in precision medicine and public health. There have been extensive studies on statistical and computational approaches to identify (i) whether a peptide is an antimicrobial peptide (AMP) or a non-AMP and (ii) which targets these sequences are effective against (Gram-positive, Gram-negative, etc.). Despite the existing deep learning…

HydrAMP: a deep generative model for antimicrobial peptide discovery

HydrAMP is a conditional variational autoencoder that learns a lower-dimensional, continuous space of peptide representations and captures their antimicrobial properties, enabling progress towards a new generation of antibiotics.

DeepAcr: Predicting Anti-CRISPR with Deep Learning

DeepAcr is a novel deep learning method for anti-CRISPR analysis that achieves impressive performance by using a Transformer protein language model pre-trained on 250 million protein sequences, which overcomes the data scarcity problem.

Emerging Computational Approaches for Antimicrobial Peptide Discovery

The motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models and to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space.

ProNet DB: A proteome-wise database for protein surface property representations and RNA-binding profiles

This work proposes the first comprehensive database, namely ProNet DB, which incorporates multiple protein surface representations and RNA-binding landscape for more than 33,000 protein structures covering the proteome from AlphaFold Protein Structure Database (AlphaFold DB) and experimentally validated protein structures deposited in Protein Data Bank (PDB).



Deep learning regression model for antimicrobial peptide design

A convolutional neural network is designed to perform combined classification and regression on peptide sequences, quantitatively predicting AMP activity against Escherichia coli. The model was effective at regression, a task for which there were no publicly available comparisons.

HMD-ARG: hierarchical multi-task deep learning for annotating antibiotic resistance genes

HMD-ARG is a hierarchical multi-task method based on deep learning that provides detailed annotations of ARGs from three important aspects: resistant antibiotic class, resistance mechanism, and gene mobility.

Deep learning improves antimicrobial peptide recognition

This work proposes a neural network model with convolutional and recurrent layers that leverage primary sequence composition and shows that the proposed model outperforms state-of-the-art classification models on a comprehensive dataset.

PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences

PepCVAE is a peptide generation framework, based on a semi-supervised variational autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP) sequences. It generates novel AMP sequences with higher long-range diversity while staying closer to the training distribution of biological peptides.

CAMP: a useful resource for research on antimicrobial peptides

Collection of Anti-Microbial Peptides (CAMP) is a free online database developed to advance the present understanding of antimicrobial peptides. It is intended as a useful resource for studying sequence-activity and sequence-specificity relationships in AMPs.

DAMPD: a manually curated antimicrobial peptide database

The Dragon Antimicrobial Peptide Database (DAMPD) contains 1232 manually curated AMPs. Its integrated interface allows simple querying by taxonomy, species, AMP family, citation, keywords, and a combination of search terms and fields (Advanced Search).

AmPEP: Sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest

The optimal model, AmPEP with the 1:3 data ratio, showed high accuracy, Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC-ROC) of 0.9, and outperformed existing methods in terms of accuracy, MCC, and AUC-ROC when tested on benchmark datasets.
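
For readers unfamiliar with the MCC cited in this summary, the following is a minimal sketch of the standard definition computed from binary confusion-matrix counts. The function name `mcc` and the example counts are illustrative assumptions and are not taken from AmPEP or its benchmarks.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; by a common convention (assumed here),
    0.0 is returned when the denominator is zero.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to show the range of the metric.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # every prediction correct
inverted = mcc(tp=0, tn=0, fp=50, fn=50)     # every prediction wrong
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # no better than random
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy and AUC-ROC for imbalanced AMP versus non-AMP datasets.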

AntiBP2: improved version of antibacterial peptide prediction

Among antibacterial peptides, there is a preference for certain residues at the N and C termini, which helps to discriminate them from non-antibacterial peptides. Their further classification by source and family is also studied.