Chemical similarity searches using latent semantic structural indexing (LaSSI) and comparison to TOPOSIM.


Similarity searches based on chemical descriptors have proven extremely useful in aiding large-scale drug screening. Here we present results of similarity searching using Latent Semantic Structure Indexing (LaSSI). LaSSI uses a singular value decomposition on chemical descriptors to project molecules into a k-dimensional descriptor space, where k is the number of retained singular values. The effect of the projection is that certain descriptors are emphasized over others and some descriptors may count as partially equivalent to others. We compare LaSSI searches to searches done with TOPOSIM, our standard in-house method, which uses the Dice similarity definition. Standard descriptor-based methods such as TOPOSIM count all descriptors equally and treat all descriptors as independent. For this work we use atom pairs and topological torsions as examples of chemical descriptors. Using objective criteria to determine how effective one similarity method is versus another in selecting active compounds from a large database, we find for a series of 16 drug-like probes that LaSSI is as good as or better than TOPOSIM in selecting active compounds from the MDDR database, if the user is allowed to treat k as an adjustable parameter. Typically, LaSSI selects very different sets of actives than does TOPOSIM, so it can find classes of actives that TOPOSIM would miss.
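The contrast between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small descriptor-by-molecule count matrix `X`, a count-based form of the Dice coefficient, and cosine similarity in the projected space; the abstract specifies none of these. The function names (`dice_similarity`, `lassi_project`, `project_probe`, `cosine`) and the toy data are hypothetical.

```python
import numpy as np


def dice_similarity(a, b):
    """Dice similarity between two non-negative descriptor count vectors.

    Assumed form: 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)).
    Every descriptor counts equally and independently.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 0.0
    return float(2.0 * np.minimum(a, b).sum() / denom)


def lassi_project(X, k):
    """Project molecules into a k-dimensional space via a truncated SVD.

    X has shape (n_descriptors, n_molecules). Returns the retained left
    singular vectors U_k and the molecule coordinates, shape (n_molecules, k).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))  # cannot keep more singular values than exist
    U_k = U[:, :k]
    coords = Vt[:k, :].T * s[:k]  # row j equals U_k.T @ X[:, j]
    return U_k, coords


def project_probe(probe, U_k):
    """Map a probe's descriptor vector into the same k-dimensional space."""
    return np.asarray(probe, dtype=float) @ U_k


def cosine(u, v):
    """Cosine similarity (an assumed choice of measure in the projected space)."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0 or nv == 0:
        return 0.0
    return float(u @ v / (nu * nv))


# Toy data: 4 hypothetical descriptors (rows) by 3 molecules (columns).
X = np.array([
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 3.0],
])

probe = X[:, 0]  # use the first molecule as the probe
U_k, coords = lassi_project(X, k=2)
probe_k = project_probe(probe, U_k)

for j in range(X.shape[1]):
    print(j, dice_similarity(probe, X[:, j]), cosine(probe_k, coords[j]))
```

Lowering `k` discards the smaller singular values, which is what lets correlated descriptors count as partially equivalent. With `k` equal to the full rank of `X`, the projected cosine similarity reduces to the ordinary cosine similarity on the raw descriptor vectors.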

Cite this paper

@article{Hull2001ChemicalSS, title={Chemical similarity searches using latent semantic structural indexing (LaSSI) and comparison to TOPOSIM.}, author={Richard D. Hull and Eugene M. Fluder and Suresh B. Singh and Robert B. Nachbar and Simon K. Kearsley and Robert P. Sheridan}, journal={Journal of Medicinal Chemistry}, year={2001}, volume={44}, number={8}, pages={1185--1191} }