CHERRY: a Computational metHod for accuratE pRediction of virus–pRokarYotic interactions using a graph encoder–decoder model

Authors: Jiayu Shang and Yanni Sun
Journal: Briefings in Bioinformatics
Abstract: Prokaryotic viruses, which infect bacteria and archaea, are key players in microbial communities. Predicting the hosts of prokaryotic viruses helps decipher the dynamic relationship between microbes. Experimental methods for host prediction cannot keep pace with the fast accumulation of sequenced phages. Thus, there is a need for computational host prediction. Despite some promising results, computational host prediction remains a challenge because of the limited known interactions and…

Comparative Genomics of Closely-Related Gordonia Cluster DR Bacteriophages

It is suggested that BiggityBass (as well as several of its close relatives) is likely able to infect host bacteria from a wide range of genera—from Gordonia to Nocardia to Rhodococcus, making it a suitable candidate for future phage therapy and wastewater treatment strategies.

PhaBOX: A web server for identifying and characterizing phage contigs in metagenomic data

This is the first web server that supports integrated phage analysis, including detecting phage contigs from the metagenomic assembly, lifestyle prediction, taxonomic classification, and host prediction, and PhaBOX also supports visualization of the essential features for making predictions.

iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria

iPHoP is described, a two-step framework that integrates multiple methods to reliably predict host taxonomy at the genus rank for a broad range of viruses infecting bacteria and archaea, while retaining a low false discovery rate.

PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer

This work investigated two applications that can use the output of PhaVIP, phage taxonomy classification and phage host prediction, and the results showed the benefit of using classified proteins over all proteins.

Advances in the field of phage-based therapy with special emphasis on computational resources

All phage-associated prediction methods that include the prediction of phages for a bacterial strain, the host for a phage and the identification of interacting phage-host pairs are compiled.

PhaTYP: predicting the lifestyle for bacteriophages using BERT

PhaTYP (Phage TYPe prediction tool) is developed to improve the accuracy of lifestyle prediction on short contigs, and the utility of PhaTYP for analyzing the phage lifestyle on human neonates’ gut data is demonstrated.

Expansion of Colorectal Cancer Biomarkers Based on Gut Bacteria and Viruses

Novel microbiota combinations with discriminatory power for colorectal tumors are identified, and the potential interactions of gut bacteria with viruses in the adenoma-carcinoma sequence are revealed, which implies that the whole microbiome, not only bacteria, deserves more attention in further studies.

iPHoP: an integrated machine-learning framework to maximize host prediction for metagenome-assembled virus genomes

iPHoP is described, a two-step framework that integrates multiple methods to provide host predictions for a broad range of viruses while retaining a low (<10%) false-discovery rate, and it is illustrated how iPHoP can provide extensive host prediction and guide further characterization of uncultivated viruses.

MetaHiC phage-bacteria infection network reveals active cycling phages of the healthy human gut

Bacteriophages play important roles in regulating the intestinal human microbiota composition, dynamics, and homeostasis, and characterizing their bacterial hosts is needed to understand their

Major bacterial lineages are essentially devoid of CRISPR-Cas viral defence systems

It is shown that entire lineages of uncultivated microorganisms are essentially devoid of CRISPR-Cas systems, compared with previous reports of 40% occurrence in bacteria and 81% in archaea.

The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution.

Glacier ice archives nearly 15,000-year-old microbes and phages

Existing clean sampling procedures for glacier ice are expanded using controlled artificial ice-core experiments and previously established low-biomass metagenomic approaches are adapted to study glacier-ice viruses, providing a first window into viral communities and functions in ancient glacier environments.

Prokaryotic virus host predictor: a Gaussian model for host prediction of prokaryotic viruses in metagenomics

The Prokaryotic virus Host Predictor software tool provides an intuitive and user-friendly API for the Gaussian model described herein, which will facilitate the rapid identification of hosts for newly identified prokaryotic viruses in metagenomic studies.

Computational approaches to predict bacteriophage–host relationships

Analysis of 820 phages with annotated hosts shows how current knowledge and insights about the interaction mechanisms and ecology of coevolving phages and bacteria can be exploited to predict phage–host relationships, with potential relevance for medical and industrial applications.

Fast and sensitive protein alignment using DIAMOND

DIAMOND is introduced, an open-source algorithm based on double indexing that is 20,000 times faster than BLASTX on short reads and has a similar degree of sensitivity.

Detecting the hosts of bacteriophages using GCN-based semi-supervised learning

This work presents a semi-supervised learning model, named HostG, to conduct host prediction for novel phages, and demonstrates that it competes favorably against the state-of-the-art pipelines.

DeepHost: phage host prediction with convolutional neural network.

To encode the phage genomes into matrices, a genome encoding method is designed that applies various spaced $k$-mer pairs to tolerate sequence variations, including insertions, deletions, and mutations.