Selecting machine-learning scoring functions for structure-based virtual screening.

@article{Ballester2019SelectingMS,
  title={Selecting machine-learning scoring functions for structure-based virtual screening.},
  author={Pedro J. Ballester},
  journal={Drug discovery today. Technologies},
  year={2019},
  volume={32-33},
  pages={81-87}
}
  • P. Ballester
  • Published 1 December 2019
  • Computer Science, Medicine
  • Drug discovery today. Technologies
Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims to leverage these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their…
Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning
TLDR
It is claimed that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS, and results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking.
Computational representations of protein–ligand interfaces for structure-based virtual screening
TLDR
The authors review the computational methods for representing protein-ligand interfaces, which include the traditional ones that use deliberately designed fingerprints and descriptors and the more recent methods that automatically extract features with deep learning.
The impact of cross-docked poses on performance of machine learning classifier for protein–ligand binding pose prediction
TLDR
This study developed several XGBoost-trained classifiers to discriminate the near-native binding poses from decoys, and systematically assessed their performance with/without the involvement of the cross-docked poses in the training/test sets.
High-throughput virtual laboratory for drug discovery using massive datasets
Time-to-solution for structure-based screening of massive chemical databases for COVID-19 drug discovery has been decreased by an order of magnitude, and a virtual laboratory has been deployed at
Efficient Hit-to-Lead Searching of Kinase Inhibitor Chemical Space via Computational Fragment Merging
TLDR
This work describes a computational strategy focused on kinase inhibitors, intended to expedite the identification of analogs with improved potency and to facilitate the rapid assembly and screening of increasingly large libraries for focused hit-to-lead optimization.
Assigning confidence to molecular property prediction
TLDR
Assessing uncertainty in property prediction models is essential whenever closed-loop drug design campaigns relying on high-throughput virtual screening are deployed, and considering sources of uncertainty leads to better-informed validations, more reliable predictions and more realistic expectations of the entire workflow.
Resources and computational strategies to advance small molecule SARS-CoV-2 discovery: Lessons from the pandemic and preparing for future health crises
TLDR
This mini-review reports several databases and online tools that could assist the discovery of anti-SARS-CoV-2 small chemical compounds and peptides and questions the overall lack of discussion and plan observed in academic research in many countries during this crisis.
Artificial intelligence in drug discovery: recent advances and future perspectives
TLDR
Deep learning-based approaches have only begun to address some fundamental problems in drug discovery, and methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms will likely become commonplace and help address some of the most challenging questions.
Combination of pose and rank consensus in docking-based virtual screening: the best of both worlds
The new methodology named Pose/Ranking Consensus (PRC) combines both pose and ranking consensus strategies. It displays an enhanced performance in terms of enrichment factor and hit rate, ensuring

References

SHOWING 1-10 OF 61 REFERENCES
Performance of machine-learning scoring functions in structure-based virtual screening
TLDR
A new ready-to-use scoring function (RF-Score-VS), trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets, provides much better prediction of measured binding affinity than Vina.
The impact of compound library size on the performance of scoring functions for structure-based virtual screening.
TLDR
It is found that screening a larger compound library results in more potent actives being identified in all six additional targets using a different docking tool along with its classical SF, and a way to improve the potency of the retrieved molecules further is to rank them with more accurate ML-based SFs.
Machine‐learning scoring functions to improve structure‐based binding affinity prediction and virtual screening
TLDR
The emerging picture from these studies is that the classical approach of using linear regression with a small number of expert‐selected structural features can be strongly improved by a machine‐learning approach based on nonlinear regression allied with comprehensive data‐driven feature selection.
Machine‐learning scoring functions for structure‐based virtual screening
TLDR
This review highlighted the codes and webservers that are available to build or apply machine‐learning scoring functions to prospective structure‐based virtual screening studies, in which mid‐nanomolar binders with novel chemical structures were directly discovered without any potency optimization.
Practical Model Selection for Prospective Virtual Screening
TLDR
This work considers a wide range of ligand-based machine learning and docking-based approaches for virtual screening on two protein–protein interactions, PriA-SSB and RMI-FANCM, and identifies a random forest as the best algorithm for these targets over more sophisticated neural network-based models.
Improved Method of Structure-Based Virtual Screening via Interaction-Energy-Based Learning
TLDR
A new virtual screening method, Similarity of Interaction Energy VEctor Score (SIEVE-Score), is proposed, in which protein-ligand interaction energies are extracted to represent docking poses for machine learning.
Machine‐learning scoring functions for structure‐based drug lead optimization
TLDR
The performance gap between classical and machine-learning SFs for drug lead optimization in the 2015–2019 period was large and has now broadened owing to methodological improvements and the availability of more training data.
Structure-Based Virtual Screening for Drug Discovery: a Problem-Centric Review
TLDR
The recent advances and applications in SBVS are reviewed with a special focus on docking-based virtual screening and the researchers’ practical efforts in real projects are emphasized by understanding the ligand-target binding interactions as a premise.
Improving structure-based virtual screening performance via learning from scoring function components
TLDR
The EATL approach not only outperforms classical SFs in absolute performance and initial enrichment (BEDROC) but also performs comparably to other advanced ML-based methods on the diverse subset of Directory of Useful Decoys: Enhanced (DUD-E).
Machine learning classification can reduce false positives in structure-based virtual screening
TLDR
A strategy is reported for building a training dataset (D-COID) that aims to generate highly compelling decoy complexes, each individually matched to an available active complex; the approach is intended to help provide chemical probes for new potential drug targets as they are discovered.