This paper explores the utility of data mining and machine learning algorithms for the induction of mutagenicity structure-activity relationships (SARs) from noncongeneric data sets. We compare (i) a newly developed algorithm (MOLFEA) for the generation of descriptors (molecular fragments) for noncongeneric compounds with traditional SAR approaches …
We propose a new kernel, based on 2-D structural chemical similarity, that integrates activity-specific information from the training data, and a new approach to applicability domain estimation that takes feature significances and activity distributions into consideration. The new kernel provides better results than the well-established Tanimoto kernel, …
Motivation: The Predictive Toxicology Challenge (PTC) was initiated to stimulate the development of advanced techniques for predictive toxicology models. The goal of this challenge was to compare different approaches for the prediction of rodent carcinogenicity, based on the experimental results of the US National Toxicology Program (NTP). Results: 111 sets …
Motivation: The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology Challenge (PTC) competition was organized to test the state of the art in applying machine learning to form such predictive models. Results: Fourteen machine …
Intestinal drug absorption in humans is a central topic in drug discovery. In this study, we use a broad selection of machine learning and statistical methods for the classification and numerical prediction of this key end point. Our data set is based on a selection of 458 small druglike compounds with FDA approval. Using easily available tools, we …
We initiated the Predictive Toxicology Challenge (PTC) to stimulate the development of advanced SAR techniques for predictive toxicology models. The goal of this challenge is to predict the rodent carcinogenicity of new compounds based on the experimental results of the US National Toxicology Program (NTP). Submissions will be evaluated on quantitative and …
It is a well-known fact that propositional learning algorithms require "good" features to perform well in practice. So a major step in data engineering for inductive learning is the construction of good features by domain experts. These features often represent properties of structured objects, where a property typically is the occurrence of a certain …
Mutagenicity and carcinogenicity are endpoints of major environmental and regulatory concern. These endpoints are also important targets for development of alternative methods for screening and prediction due to the large number of chemicals of potential concern and the tremendous cost (in time, money, animals) of rodent carcinogenicity bioassays. Both …