Discovering the unintended 'off-targets' that predict adverse drug reactions is daunting by empirical methods alone. Drugs can act on several protein targets, some of which can be unrelated by conventional molecular metrics, and hundreds of proteins have been implicated in side effects. Here we use a computational strategy to predict the activity of 656 …
Arabidopsis thaliana is an important model system for plant biologists. In 1996, an international collaboration (the Arabidopsis Genome Initiative) was formed to sequence the whole genome of Arabidopsis, and in 1999 the sequence of the first two chromosomes was reported. The sequence of the last three chromosomes and an analysis of the whole genome are …
Conventional similarity searching of molecules compares single (or multiple) active query structures to each other in a relative framework, by means of a structural descriptor and a similarity measure. While this often works well, depending on the target, we show here that retrieval rates can be improved considerably by incorporating an external framework …
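As an illustration of the conventional setup described in the entry above (a structural descriptor plus a similarity measure), here is a minimal sketch. It assumes each molecule is already encoded as a set of integer feature identifiers and uses the Tanimoto coefficient as the similarity measure. Both choices, along with the names `tanimoto` and `rank_by_similarity`, are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two feature sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Set[int], database: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Score every database entry against the query, most similar first."""
    scored = [(name, tanimoto(query, features)) for name, features in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: feature sets stand in for real structural descriptors.
database = {"mol_a": {1, 2, 3}, "mol_b": {7, 8}, "mol_c": {1, 2}}
ranking = rank_by_similarity({1, 2, 3}, database)
```

In this toy run, `mol_a` ranks first because it is identical to the query and `mol_b` ranks last because it shares no features with it.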
We present a workflow that leverages data from chemogenomics-based target predictions with Systems Biology databases to better understand off-target-related toxicities. By analyzing a set of compounds that share a common toxic phenotype and by comparing the pathways they affect with pathways modulated by nontoxic compounds, we are able to establish links …
Target identification is a critical step following the discovery of small molecules that elicit a biological phenotype. The present work seeks to provide an in silico correlate of experimental target fishing technologies in order to rapidly fish out potential targets for compounds on the basis of chemical structure alone. A multiple-category …
Different molecular descriptors capture different aspects of molecular structures, but this effect has not yet been quantified systematically on a large scale. In this work, we calculate the similarity of 37 descriptors by repeatedly selecting query compounds and ranking the rest of the database. Euclidean distances between the rank-ordering of different …
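The comparison of rank-orderings mentioned in the entry above can be sketched as follows. This is a minimal illustration, assuming that each descriptor yields one similarity score per database compound for a given query and that two descriptors are compared through the Euclidean distance between the resulting rank vectors. The function names and the toy scores are hypothetical, and the abstract is truncated, so the paper's actual procedure may differ.

```python
import math
from typing import List, Sequence


def rank_positions(scores: Sequence[float]) -> List[int]:
    """Return the 1-based rank of each compound, with the highest score ranked 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_distance(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Euclidean distance between the rank vectors induced by two score lists."""
    ranks_a = rank_positions(scores_a)
    ranks_b = rank_positions(scores_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ranks_a, ranks_b)))


# Hypothetical scores for three database compounds under two descriptors.
descriptor_1 = [0.9, 0.1, 0.5]
descriptor_2 = [0.1, 0.9, 0.5]
distance = rank_distance(descriptor_1, descriptor_2)
```

A distance of zero means the two descriptors order the database identically for that query; larger values mean the orderings diverge.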
We compared two algorithms for ligand-target prediction, namely, the Laplacian-modified Bayesian classifier and the Winnow algorithm. A dataset derived from the WOMBAT database, spanning 20 pharmaceutically relevant activity classes with 13 000 compounds, was used for performance assessment in 24 different experiments, each of which was assessed using a …
In silico target fishing is an emerging technology that enables the prediction of biological targets of compounds on the basis of chemical structure by using information from increasingly available biologically annotated chemical databases. We provide a comparative review of recent studies in which data mining, similarity, or docking of chemical structures …
High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of the identification of novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore, the selection of the optimal data mining …
High-throughput screening (HTS) data is often noisy, containing both false positives and false negatives. Thus, careful triaging and prioritization of the primary hit list can save time and money by identifying potential false positives before incurring the expense of follow-up. Of particular concern are cell-based reporter gene assays (RGAs) where the number of …