A Generic Query-Based Model for Scalable Clustering

Author: Michael E. Houle
This paper presents a generic model for clustering that requires no direct knowledge of the nature or representation of the data. In lieu of such knowledge, the relevant-set clustering (RSC) model relies solely on the existence of an oracle that accepts a query in the form of a data item, and returns a ranked set of items relevant to the query. In principle, the role of the oracle could be played by any similarity search structure, or even a commercial search engine whose ranking function and…
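
The oracle described in the abstract can be pictured as a small interface: it takes a data item as the query and returns a ranked list of relevant items. The sketch below is only an illustration of that contract, not the paper's implementation. The names `RelevanceOracle`, `BruteForceOracle`, `shared_letters`, and the result-size parameter `k` are all hypothetical, and the brute-force ranking stands in for whatever similarity search structure or search engine would play the oracle's role in practice.

```python
from typing import Callable, Iterable, List, Protocol


class RelevanceOracle(Protocol):
    """Hypothetical interface: a query item in, a ranked list of relevant items out."""

    def query(self, item: str, k: int) -> List[str]:
        ...


class BruteForceOracle:
    """Illustrative stand-in for the oracle (an assumption, not the paper's method).

    It ranks every stored item by a caller-supplied similarity function.
    A clustering method that only calls query() never needs to see that function.
    """

    def __init__(self, items: Iterable[str], similarity: Callable[[str, str], float]):
        self._items = list(items)
        self._similarity = similarity

    def query(self, item: str, k: int) -> List[str]:
        # Sort by descending similarity to the query; ties keep insertion order.
        ranked = sorted(
            self._items,
            key=lambda other: self._similarity(item, other),
            reverse=True,
        )
        return ranked[:k]


def shared_letters(a: str, b: str) -> float:
    """Toy similarity for the demo: number of distinct letters in common."""
    return float(len(set(a) & set(b)))


oracle: RelevanceOracle = BruteForceOracle(["cat", "cart", "dog", "dot"], shared_letters)
print(oracle.query("cat", 2))
```

Because the caller sees only ranked results, the same calling code would work unchanged if the brute-force scan were swapped for an index or an external search service.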