Accurate ADMET Prediction with XGBoost

@article{Tian2022AccurateAP,
  title={Accurate ADMET Prediction with XGBoost},
  author={Hao Tian and Rajas Ketkar and Peng Tao},
  journal={ArXiv},
  year={2022},
  volume={abs/2204.07532}
}
The absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties are important in drug discovery as they define efficacy and safety. Here, we apply an ensemble of features, including fingerprints and descriptors, and a tree-based machine learning model, extreme gradient boosting, for accurate ADMET prediction. Our model performs well in the Therapeutics Data Commons ADMET benchmark group. Of the 22 tasks, our model ranks first in 10 and in the top 3 in 18.
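The idea in the abstract, concatenating several feature blocks and fitting boosted trees on the joined vector, can be illustrated with a dependency-free toy. This is a minimal sketch under stated assumptions, not the paper's code: the molecules, bit vectors, descriptor values, and targets are made up, all function names are hypothetical, and depth-1 trees (stumps) with squared-error loss stand in for a full XGBoost model. In practice one would compute features with a cheminformatics toolkit and train with the xgboost package.

```python
# Toy sketch: join a fingerprint and descriptors, then fit boosted stumps.
# All names and data are hypothetical; this is not the paper's implementation.

def concat_features(fingerprint_bits, descriptors):
    """Join a binary fingerprint and real-valued descriptors into one vector."""
    return [float(b) for b in fingerprint_bits] + [float(d) for d in descriptors]

def leaf_weight(grad_sum, hess_sum, lam):
    # Regularized leaf value from second-order boosting: w = -G / (H + lambda)
    return -grad_sum / (hess_sum + lam)

def fit_stump(X, grads, hess, lam):
    """Choose the (feature, threshold) split with the largest gain."""
    def score(g, h):
        return g * g / (h + lam)
    G, H = sum(grads), sum(hess)
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so the right-hand side of a split is never empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            gl = sum(grads[i] for i in left)
            hl = sum(hess[i] for i in left)
            gain = score(gl, hl) + score(G - gl, H - hl) - score(G, H)
            if best is None or gain > best[0]:
                best = (gain, j, t,
                        leaf_weight(gl, hl, lam),
                        leaf_weight(G - gl, H - hl, lam))
    if best is None:  # no feature varies: fall back to a single leaf
        return (0, float("inf"), leaf_weight(G, H, lam), 0.0)
    return best[1:]

def predict_stump(stump, row):
    j, t, w_left, w_right = stump
    return w_left if row[j] <= t else w_right

def fit_boosted(X, y, n_rounds=50, eta=0.3, lam=1.0):
    base = sum(y) / len(y)  # start from the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        grads = [p - t for p, t in zip(preds, y)]  # gradient of 0.5 * (p - t)^2
        hess = [1.0] * len(y)                      # its second derivative
        stump = fit_stump(X, grads, hess, lam)
        stumps.append(stump)
        preds = [p + eta * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, eta, stumps

def predict(model, row):
    base, eta, stumps = model
    return base + eta * sum(predict_stump(s, row) for s in stumps)

# Made-up molecules: (4-bit fingerprint, two descriptors, target value).
molecules = [
    ([1, 0, 0, 1], [1.2, 46.0], 0.8),
    ([0, 1, 0, 1], [2.5, 78.1], 1.9),
    ([1, 1, 0, 0], [0.3, 32.0], 0.2),
    ([0, 0, 1, 1], [3.1, 92.1], 2.6),
]
X = [concat_features(fp, desc) for fp, desc, _ in molecules]
y = [target for _, _, target in molecules]
model = fit_boosted(X, y)
train_mse = sum((predict(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
```

Each stump is scored on the combined vector, so a split may land on a fingerprint bit or on a descriptor, whichever gives the larger gain. The training error computed here only shows that the fit improves on the mean baseline; it says nothing about held-out performance.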

References

admetSAR 2.0: web‐service for prediction and optimization of chemical ADMET properties

TLDR
This update of admetSAR, developed as a comprehensive source and free tool for the prediction of chemical ADMET properties, focuses on extension and optimization of existing models with significant quantity and quality improvement on training data.

FP-ADMET: a compendium of fingerprint-based ADMET prediction models

TLDR
It is found that for a majority of the properties, fingerprint-based random forest models yield comparable or better performance compared with traditional 2D/3D molecular descriptors.

vNN Web Server for ADMET Predictions

TLDR
The vNN method is used to develop 15 absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction models that quickly assess some of the most important properties of potential drug candidates, including their cytotoxicity, mutagenicity, cardiotoxicity, drug-drug interactions, microsomal stability, and likelihood of causing drug-induced liver injury.

PASSer: prediction of allosteric sites server

TLDR
An ensemble learning method combining eXtreme gradient boosting and a graph convolutional neural network is used to predict allosteric sites; it can learn physical properties and topology without any prior information and shows good performance under multiple indicators.

Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development

TLDR
Therapeutics Data Commons is introduced, the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics, and it is envisioned that TDC can facilitate algorithmic and scientific advances and considerably accelerate machine-learning model development, validation and transition into biomedical and clinical implementation.

XGraphBoost: Extracting Graph Neural Network-Based Features for a Better Prediction of Molecular Properties

TLDR
The integrated framework XGraphBoost is proposed to extract features using a GNN and build an accurate prediction model of molecular properties using the classifier XGBoost. The experimental results strongly suggest that XGraphBoost may facilitate the efficient and accurate prediction of various molecular properties.

XGBoost: A Scalable Tree Boosting System

TLDR
This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.

Reoptimization of MDL Keys for Use in Drug Discovery

TLDR
Improvements in the performance of MDL keysets reoptimized for use in molecular similarity are reported, and the use of genetic algorithms in the selection of optimal keysets is explored.