SVM approach for predicting LogP

  • Quan Liao, Jianhua Yao, Shengang Yuan
    Molecular Diversity
Summary: The logarithm of the partition coefficient between n-octanol and water (logP) is an important parameter for drug discovery. Based on a comparison of several logP prediction models, i.e. Support Vector Machines (SVM), Partial Least Squares (PLS) and Multiple Linear Regression (MLR), the authors report that the SVM model performs best.
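
As a small illustration of the quantity being predicted, the sketch below computes logP directly from its definition: the base-10 logarithm of the ratio of a solute's equilibrium concentrations in n-octanol and in water. The function name `log_p` and the example concentrations are assumptions made for illustration. They do not come from the paper, which predicts logP from molecular descriptors rather than from measured concentrations.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return logP = log10([solute]_octanol / [solute]_water).

    Both concentrations must be positive and in the same units.
    The name and signature are hypothetical, chosen for this sketch.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical equilibrium concentrations (e.g. in mol/L), not measured data.
# A solute 100 times more concentrated in octanol has a logP of about 2.
print(log_p(conc_octanol=10.0, conc_water=0.1))
```

A positive value indicates a lipophilic solute that prefers the octanol phase; a negative value indicates a hydrophilic one.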

Three-class classification models of logS and logP derived by using GA–CG–SVM approach
A comparison between the performance of GA–CG–SVM models and that of GA–SVM models shows that SVM parameter optimization has a significant impact on the quality of the SVM classification model.
In Silico Log P Prediction for a Large Data Set with Support Vector Machines, Radial Basis Neural Networks and Multiple Linear Regression
  • Haifeng Chen
  • Mathematics, Medicine
    Chemical biology & drug design
  • 2009
Investigation of the correlation between partition coefficient and physico‐chemical descriptors for a large data set of compounds shows that non‐linear support vector machines derive statistical models with better prediction ability than those of radial basis function neural networks and multiple linear regression methods.
Binary Classification of Aqueous Solubility Using Support Vector Machines with Reduction and Recombination Feature Selection
The best model demonstrated robust performance in both cross-validation and prediction of two independent test sets, indicating it could be a practical tool to select soluble compounds for screening, purchasing, and synthesizing.
Prediction and interpretation of the lipophilicity of small peptides
Results prove that a PLS model based on VolSurf+ descriptors is the best tool to predict logDoct of neutral and ionised peptides, and the mechanistic interpretation reveals that including an HBD group in the chemical structure is more efficient in decreasing lipophilicity than including an HBA group.
Prediction of mutagenic toxicity by combination of Recursive Partitioning and Support Vector Machines
Recursive Partitioning (RP) method was used to select descriptors and RP and Support Vector Machines were used to construct structure–toxicity relationship models, RP model and SVM model, respectively.
Large-scale ligand-based predictive modelling using support vector machines
To investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures and a non-linear kernel proved to be infeasible for large data sizes.
Energy-entropy prediction of octanol–water logP of SAMPL7 N-acyl sulfonamide bioisosters
Test a method in this class called Energy Entropy Multiscale Cell Correlation (EE-MCC) for the calculation of octanol–water logP values for 22 N-acyl sulfonamides in the SAMPL7 Physical Properties Challenge (Statistical Assessment of the Modelling of Proteins and Ligands).
Calculating Partition Coefficients of Small Molecules in Octanol/Water and Cyclohexane/Water.
Results from GAFF and GAFF-DC seem to exhibit systematic biases in opposite directions for calculated cyclohexane/water partition coefficients.
Assessment of the chromatographic lipophilicity of eight cephalosporins on different stationary phases
Synthesis, QSAR Study and Optimization of Propiophenone Oxime Derivatives†
A method combining chemical and biological experiments and computer-aided molecular design was used to optimize lead compounds with inhibiting activity against Sphaerotheca fuliginea.


A Universal Molecular Descriptor System for Prediction of LogP, LogS, LogBB, and Absorption
  • Hongmao Sun
  • Chemistry, Medicine
    J. Chem. Inf. Model.
  • 2004
Predictive models for octanol/water partition coefficient (logP), aqueous solubility (logS), blood-brain barrier (logBB), and human intestinal absorption (HIA) were built from a universal, generic
Drug Discovery Using Support Vector Machines. The Case Studies of Drug-likeness, Agrochemical-likeness, and Enzyme Inhibition Predictions
Support Vector Machines is used for estimating the activity of Carbonic Anhydrase II (CA II) enzyme inhibitors and it is found that the prediction quality of the SVM model is better than that reported earlier for conventional QSAR.
Automatic log P estimation based on combined additive modeling methods
Summary: A program for the automatic estimation of the logarithm of the partition coefficient between 1-octanol and water phases (log P) has been developed as a component of a system entitled CHEMICALC
Comparative Study of QSAR/QSPR Correlations Using Support Vector Machines, Radial Basis Function Neural Networks, and Multiple Linear Regression
The results indicate that SVM can be used as a powerful alternative modeling tool for QSAR studies and that its results are comparable or superior to those obtained by MLR and RBFNN.
Prediction of Protein Retention Times in Anion-Exchange Chromatography Systems Using Support Vector Regression
A novel algorithm based on Support Vector Machine (SVM) regression has been employed to obtain predictive QSRR models using a two-step computational strategy that can be used as an automated prediction tool for virtual high-throughput screening (VHTS).
A New Atom-Additive Method for Calculating Partition Coefficients
The result shows that the method for log P estimation is applicable to quantitative structure−activity relationship studies and gives better results than other more complicated atom-additive methods.
Classification of the carcinogenicity of N-nitroso compounds based on support vector machines and linear discriminant analysis.
The support vector machine (SVM), as a novel type of learning machine, was used to develop a classification model of carcinogenic properties of 148 N-nitroso compounds and confirmed the discriminative capacity of the calculated descriptors.
Generalized Fragment-Substructure Based Property Prediction Method
  • M. Clark
  • Medicine, Computer Science
    J. Chem. Inf. Model.
  • 2005
A novel method for making predictive models based on decomposing 2D structure into component structural fragments is used to model logP, water solubility, and melting point, and facilitates understanding of how molecules might be altered to improve the desired properties.
Autocorrelation modeling of lipophilicity with a back-propagation neural network
Abstract: From a training set of 7200 chemicals a back-propagation neural network (BNN) model was developed for estimating the n-octanol/water partition coefficient of organic molecules. Chemicals