To choose the dimension of a calibration model in chemistry correctly, a new, simple and effective method named Monte Carlo cross validation (MCCV) is introduced in the present work. Unlike the leave-one-out procedure commonly used in chemometrics for cross validation (CV), the Monte Carlo cross validation developed in this paper is an asymptotically…
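As a rough illustration of the contrast drawn above, the following sketch scores candidate models on many random train/validation partitions instead of leaving out one sample at a time. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names (`mccv_splits`, `mccv_error`), the toy data, the two candidate models, the validation size of 20 and the 100 repeats are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def mccv_splits(n_samples, n_validation, n_repeats, seed=0):
    """Yield (train, validation) index lists from repeated random partitions."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = rng.sample(indices, n_samples)  # a fresh random permutation
        yield shuffled[n_validation:], shuffled[:n_validation]


def mccv_error(x, y, fit, n_validation, n_repeats, seed=0):
    """Mean squared validation error of `fit`, averaged over all random splits."""
    errors = []
    for train, val in mccv_splits(len(x), n_validation, n_repeats, seed):
        predict = fit([x[i] for i in train], [y[i] for i in train])
        errors.append(mean((y[i] - predict(x[i])) ** 2 for i in val))
    return mean(errors)


def fit_constant(xs, ys):
    """Hypothetical zero-dimensional model: always predict the training mean."""
    c = mean(ys)
    return lambda x: c


def fit_line(xs, ys):
    """Hypothetical one-dimensional model: ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


# Toy data (an assumption for illustration): a noisy straight line.
noise = random.Random(1)
x = [i / 10 for i in range(40)]
y = [2.0 * xi + 1.0 + noise.gauss(0, 0.1) for xi in x]

candidates = {"constant": fit_constant, "line": fit_line}
scores = {
    name: mccv_error(x, y, fit, n_validation=20, n_repeats=100)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)  # model with the lowest averaged error
print(scores, best)
```

The candidate with the lowest averaged validation error is selected. Leave-one-out corresponds to the special case of a validation set of size one visited exhaustively rather than sampled at random.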
BACKGROUND: Many studies have shown a consistent association between ambient air pollution and an increase in death due to cardiovascular causes. An increase in blood pressure is a common risk factor for a variety of cardiovascular diseases. However, the association between air pollution and blood pressure has not been evaluated extensively. METHODS: In…
To build a credible model for given chemical, biological, or clinical data, it may be helpful first to gain better insight into the data itself before modeling, and then to present statistically stable results derived from a large number of sub-models built on a single dataset with the aid of Monte Carlo sampling (MCS). In the present…
The rapidly increasing amount of publicly available data in biology and chemistry enables researchers to revisit interaction problems through systematic integration and analysis of heterogeneous data. Herein, we developed a comprehensive Python package to emphasize the integration of chemoinformatics and bioinformatics into a molecular informatics platform for…
SUMMARY: Sequence-derived structural and physicochemical features have been frequently used for analysing and predicting structural, functional, expression and interaction profiles of proteins and peptides. To facilitate extensive studies of proteins and peptides, we developed a freely available, open-source Python package called protein in python (propy) for…
Chitosan oligosaccharides (COS) have been reported to exert many biological activities, such as antioxidant, antitumor and anti-inflammatory effects. In the present study, we examined the effect of COS on nitric oxide (NO) production in LPS-induced N9 microglial cells. Pretreatment with COS (50–200 μg/ml) markedly inhibited NO production by…
MOTIVATION: Molecular representations of small molecules are routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source Python package called chemoinformatics in python (ChemoPy) for…
The idea of boosting is deeply rooted in our daily-life practice, which shapes how we think about chemical problems and how we build chemical models. In mathematics, boosting is an iterative reweighting procedure that sequentially applies a base learner to reweighted versions of the training data whose current…
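The reweighting loop described above can be sketched with AdaBoost, one well-known boosting algorithm; the truncated abstract does not say which variant the paper uses, so this choice is an assumption. The base learner (a one-dimensional threshold stump), the function names and the toy data are likewise hypothetical and serve only to show how misclassified samples gain weight from round to round.

```python
import math


def stump_predict(x, threshold, polarity):
    """Base learner: predict `polarity` at or above the threshold, else its negation."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump; labels are +1/-1."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, n_rounds):
    """Fit stumps sequentially, reweighting the training data after each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this round's stump
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the others.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (an assumption): no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

one_round = adaboost(xs, ys, n_rounds=1)
three_rounds = adaboost(xs, ys, n_rounds=3)
correct_one = sum(predict(one_round, x) == y for x, y in zip(xs, ys))
correct_three = sum(predict(three_rounds, x) == y for x, y in zip(xs, ys))
print(correct_one, correct_three)
```

On this toy set a single stump cannot classify every sample, while the weighted vote of three stumps fitted to successively reweighted data does.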
In structure–activity relationship (SAR) studies, a learning algorithm usually faces the problem of selecting a compact subset of descriptors related to the property of interest while ignoring the rest. This paper presents a new method of molecular descriptor selection utilizing three commonly used decision tree (DT)-based…