High-throughput virtual screening of small molecule inhibitors for SARS-CoV-2 protein targets with deep fusion models

Authors: Garrett Stevenson, Derek Jones, Hyojin Kim, W. F. Drew Bennett, Brian J. Bennion, Monica K. Borucki, Feliza A. Bourguet, Aidan T. Epstein, Magdalena Franco, Brooke Harmon, Stewart He, Max P. Katz, Daniel A. Kirshner, Victoria Lao, Edmond Y. Lau, Jacky Kai-Yin Lo, Kevin S. McLoughlin, Richard A. Mosesso, Deepa K. Murugesh, Oscar A. Negrete, Edwin A. Saada, Brent W. Segelke, Maxwell A. Stefan, Marisa W. Torres, Dina R. Weilhammer, Sergio Ernesto Wong, Yue Yang, Adam T. Zemla, Xiaohua Zhang, Fangqiang Zhu, Felice C. Lightstone, and Jonathan E. Allen
Venue: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Structure-based Deep Fusion models were recently shown to outperform several physics- and machine learning-based protein-ligand binding affinity prediction methods. As part of a multi-institutional COVID-19 pandemic response, over 500 million small molecules were computationally screened against four protein structures from the novel coronavirus (SARS-CoV-2), which causes COVID-19. Three enhancements to Deep Fusion were made in order to evaluate more than 5 billion docked poses on SARS-CoV-2… 

Stepping Back to SMILES Transformers for Fast Molecular Representation Inference

ST-KD, an end-to-end SMILES Transformer for molecular representation learning boosted by Knowledge Distillation, is proposed; it shows competitive results on the recent standard molecular datasets PCQM4M-LSC and QM9, with 3-14× the inference speed of existing graph models.

Deep Molecular Representation Learning via Fusing Physical and Chemical Information

PhysChem, a novel neural architecture that learns molecular representations by fusing physical and chemical information about molecules, achieves state-of-the-art performance on MoleculeNet, a standard molecular machine learning benchmark.

Small Molecules Targeting SARS-CoV-2 Spike Glycoprotein Receptor-Binding Domain

It is hypothesized that small molecules could disrupt the interaction of the S glycoprotein with hACE2 and inhibit viral entry, leading to an effective antiviral treatment or serving as probes to better understand the biology of SARS-CoV-2.

Accelerators for Classical Molecular Dynamics Simulations of Biomolecules

The goal is to summarize the fundamental algorithms employed in the literature, highlight the challenges that have affected accelerator implementations in practice, and provide insights into the potential of emerging hardware platforms and algorithms for MD.

A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2

A unique computational pipeline is introduced for the rapid identification and characterization of binding sites in the proteins of novel viruses, as well as the core chemical components with which these sites interact.



Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference

Fusion models that combine features and inference from complementary representations to improve binding affinity prediction are presented, showing that the fusion models make more accurate predictions than their constituent neural network models as well as docking scoring and MM/GBSA rescoring, with the benefit of greater computational efficiency.

Discovery of Small-Molecule Inhibitors of SARS-CoV-2 Proteins Using a Computational and Experimental Pipeline

This work employs molecular docking, molecular dynamics simulations, and machine learning to identify, from a library of 26 million molecules, candidate compounds that may attenuate or neutralize the effects of SARS-CoV-2.

KDEEP: Protein-Ligand Absolute Binding Affinity Prediction via 3D-Convolutional Neural Networks

This work proposes a fast machine-learning approach for predicting binding affinities using state-of-the-art 3D-convolutional neural networks and compares this approach to other machine-learning and scoring methods using several diverse data sets.

Pafnucy - A deep neural network for structure-based drug discovery

A new deep neural network tailored to estimating the binding affinity of ligand-receptor complexes is developed, which was tested on the CASF "scoring power" benchmark and Astex diverse set and outperformed classical scoring functions.

PotentialNet for Molecular Property Prediction

The PotentialNet family of graph convolutions, which are specifically designed for and achieve state-of-the-art performance on protein–ligand binding affinity prediction, is presented. A cross-validation strategy based on structural homology clustering is also introduced that can more accurately measure model generalizability, which crucially distinguishes the aims of machine learning for drug discovery from standard machine learning tasks.

Protein-Ligand Scoring with Convolutional Neural Networks

This work describes convolutional neural network scoring functions that take as input a comprehensive three-dimensional representation of a protein-ligand interaction and finds that the CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses both for pose prediction and virtual screening.

Machine‐learning scoring functions to improve structure‐based binding affinity prediction and virtual screening

The emerging picture from these studies is that the classical approach of using linear regression with a small number of expert‐selected structural features can be strongly improved by a machine‐learning approach based on nonlinear regression allied with comprehensive data‐driven feature selection.

A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking

A novel scoring function (RF-Score) that circumvents the need for problematic modelling assumptions via non-parametric machine learning is proposed, and Random Forest is used to implicitly capture binding effects that are hard to model explicitly.

Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity

A general 3-dimensional spatial convolution operation for learning atomic-level chemical interactions directly from atomic coordinates is introduced, offering a strong foundation for future improvements in structure-based bioactivity prediction.

DeepBindRG: a deep learning based method for estimating effective protein–ligand affinity

A new deep neural network-based model named DeepBindRG is proposed to predict the binding affinity of protein–ligand complexes, which learns all the effects, binding mode, and specificity implicitly by learning protein–ligand interface contact information from a large protein–ligand dataset.