Clustering a large number of compounds. 3. The limits of classification


Clustering is normally used to group similar items. Here the goal is a diverse sample from the 230,000 compounds in the National Cancer Institute Repository, so we cluster in order to select compounds that differ from the rest and thereby optimize screening for new leads. Under these constraints, our approach yielded many singleton clusters. We interpret these results as evidence for a limit to classification, contrary to the customary view of chemistry as a study of classes of compounds.
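The abstract does not spell out the clustering procedure, so the following is only an illustrative sketch of how clustering can serve diversity selection and why singleton clusters appear. It assumes a simple single-pass "leader" scheme and Tanimoto similarity over feature sets; the function names, the threshold, and the toy fragment labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(
    compounds: Dict[str, FrozenSet[str]], threshold: float
) -> List[List[str]]:
    """Single-pass leader clustering (an assumed stand-in, not the paper's method).

    Each compound joins the first cluster whose leader (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for name, feats in compounds.items():
        for members in clusters:
            if tanimoto(feats, compounds[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            # No sufficiently similar leader was found: open a new cluster.
            clusters.append([name])
    return clusters


# Hypothetical toy data: compound names mapped to invented fragment labels.
toy = {
    "A": frozenset({"ring", "OH", "Cl"}),
    "B": frozenset({"ring", "OH", "Br"}),
    "C": frozenset({"amide", "S"}),
    "D": frozenset({"nitro", "P", "F"}),
}

clusters = leader_cluster(toy, threshold=0.5)
# One representative per cluster gives a diverse sample.
diverse_sample = [members[0] for members in clusters]
# Clusters with a single member are compounds unlike anything seen so far.
singletons = [members[0] for members in clusters if len(members) == 1]
```

In this toy run only A and B share enough features to be grouped; C and D each form a cluster of one. When most compounds in a collection behave like C and D, the clustering produces many singletons, which is the situation the abstract reads as a limit to classification.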

DOI: 10.1021/ci00002a023


@article{Hodes1991ClusteringAL, title={Clustering a large number of compounds. 3. The limits of classification}, author={Louis Hodes and Alfred Feldman}, journal={Journal of Chemical Information and Computer Sciences}, year={1991}, volume={31}, number={2}, pages={347--350} }