Accelerated similarity searching and clustering of large compound sets by geometric embedding and locality sensitive hashing

@article{Cao2010AcceleratedSS,
  title={Accelerated similarity searching and clustering of large compound sets by geometric embedding and locality sensitive hashing},
  author={Yiqun Cao and Tao Jiang and Thomas Girke},
  journal={Bioinformatics},
  year={2010}
}
MOTIVATION: Similarity searching and clustering of chemical compounds by structural similarities are important computational approaches for identifying drug-like small molecules. Most algorithms available for these tasks are limited by their speed and scalability, and cannot handle today's large compound databases with several million entries.
RESULTS: In this article, we introduce a new algorithm for accelerated similarity searching and clustering of very large compound sets using embedding…

Citations

Publications citing this paper.

Secure Multiset Intersection Cardinality and its Application to Jaccard Coefficient

IEEE Transactions on Dependable and Secure Computing • 2016

Graph methods for predicting the function of chemical compounds

2014 IEEE International Conference on Granular Computing (GrC) • 2014

Plant Chemical Genomics

Methods in Molecular Biology • 2014
