Visualisation and subsets of the chemical universe database GDB-13 for virtual screening


The chemical universe database GDB-13, which enumerates 977 million organic molecules of up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules, represents a vast reservoir of new fragments. GDB-13 was classified using the MQN system discussed in the preceding paper for the analysis of PubChem fragments. Two hundred and fifty-five subsets of GDB-13 were generated by the combinatorial use of eight restrictive criteria, including fragment-like ("rule of three") and scaffold-like (no acyclic carbon atoms) filters. Virtual screening for analogs of 15 commercial drugs with 13 or fewer non-hydrogen atoms shows that retrieving MQN-neighbors of a query molecule from GDB-13 or its subsets provides on average a 38-fold enrichment in structural analogs (Daylight-type substructure fingerprint Tanimoto T(SF) > 0.7), and a 75-fold enrichment in shape-similar analogs (ROCS TanimotoCombo score > 1.4). An MQN-searchable version of GDB-13 is provided at .
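The screening workflow summarised above (retrieve the nearest neighbors of a query in descriptor space, then measure how enriched that set is in analogs) can be illustrated with a minimal sketch. Several things here are assumptions for illustration and are not stated in the abstract: that neighbors are ranked by city-block distance between integer descriptor vectors, that fingerprints are represented as sets of on-bits, and that enrichment is the analog hit rate among retrieved molecules divided by the analog rate in the whole database. The function names and the toy data are hypothetical.

```python
from typing import Sequence, Set, List


def city_block(a: Sequence[int], b: Sequence[int]) -> int:
    """City-block (Manhattan) distance between two integer descriptor vectors.

    Assumption: this is the metric used to define "neighbors"; the abstract
    does not name the metric.
    """
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(query: Sequence[int],
                      database: Sequence[Sequence[int]],
                      k: int) -> List[int]:
    """Return the indices of the k database entries closest to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: city_block(query, database[i]))
    return ranked[:k]


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def enrichment(retrieved: Sequence[int],
               analogs: Set[int],
               database_size: int) -> float:
    """Hit rate among retrieved entries divided by the database-wide rate."""
    hit_rate = len(set(retrieved) & analogs) / len(retrieved)
    base_rate = len(analogs) / database_size
    return hit_rate / base_rate


# Toy data: two-component vectors stand in for real descriptor vectors.
toy_database = [(0, 0), (1, 0), (5, 5), (0, 1), (9, 9), (2, 2)]
toy_query = (0, 0)
toy_analogs = {0, 1, 4}  # hypothetical indices of known analogs

neighbors = nearest_neighbors(toy_query, toy_database, k=3)
factor = enrichment(neighbors, toy_analogs, len(toy_database))
```

In a real setting the analog set would be defined by a similarity threshold, such as the T(SF) > 0.7 cut-off quoted above, applied to fingerprints computed by a cheminformatics toolkit; the brute-force sort shown here would also need replacing with an indexed search for a database of this size.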

DOI: 10.1007/s10822-011-9436-y

Cite this paper

@article{Blum2011VisualisationAS, title={Visualisation and subsets of the chemical universe database GDB-13 for virtual screening}, author={Lorenz C. Blum and Ruud van Deursen and Jean-Louis Reymond}, journal={Journal of Computer-Aided Molecular Design}, year={2011}, volume={25}, number={7}, pages={637--647} }