Twister Tries: Approximate Hierarchical Agglomerative Clustering for Average Distance in Linear Time

@article{Cochez2015TwisterTA,
  title={Twister Tries: Approximate Hierarchical Agglomerative Clustering for Average Distance in Linear Time},
  author={Michael Cochez and Hao Mou},
  journal={Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data},
  year={2015}
}
  • Michael Cochez, Hao Mou
  • Published 2015
  • Computer Science
  • Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
Many commonly used data-mining techniques utilized across research fields perform poorly when used for large data sets. Sequential agglomerative hierarchical non-overlapping clustering is one technique for which the algorithms' scaling properties prohibit clustering of a large amount of items. Besides the unfavorable time complexity of O(n²), these algorithms have a space complexity of O(n²), which can be reduced to O(n) if the time complexity is allowed to rise to O(n² log₂ n). In this paper…
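To make the quadratic memory cost mentioned in the abstract concrete, here is a minimal sketch of exact average-linkage agglomerative clustering. It is not the Twister Tries method from the paper; it is a textbook-style baseline under stated assumptions: the function name `average_linkage`, the restriction to 1-D points, and the absolute-difference distance are all illustrative choices. The pairwise distance table is what takes O(n²) space, and because this naive version rescans every remaining pair at each merge, it runs in roughly cubic time rather than the O(n²) achieved by more careful algorithms.

```python
from itertools import combinations


def average_linkage(points):
    """Naive exact average-linkage clustering of 1-D points (illustrative only).

    Returns a list of merges (cluster_a, cluster_b, distance). Input points
    get ids 0..n-1; each merged cluster gets the next unused id.
    """
    n = len(points)
    # Full pairwise distance table: this is the O(n^2) memory cost.
    dist = {(i, j): abs(points[i] - points[j])
            for i, j in combinations(range(n), 2)}
    size = {i: 1 for i in range(n)}  # number of points in each live cluster
    merges = []
    next_id = n
    while len(size) > 1:
        # Scan all remaining pairs for the two closest clusters.
        (a, b), d = min(dist.items(), key=lambda kv: kv[1])
        new = next_id
        next_id += 1
        # Average-linkage update: the distance from the merged cluster to any
        # other cluster c is the size-weighted mean of the two old distances.
        for c in size:
            if c in (a, b):
                continue
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, new)] = (size[a] * d_ac + size[b] * d_bc) / (size[a] + size[b])
        del dist[(a, b)]
        merges.append((a, b, d))
        size[new] = size.pop(a) + size.pop(b)
    return merges


if __name__ == "__main__":
    for a, b, d in average_linkage([0.0, 1.0, 10.0, 11.0]):
        print(f"merge {a} and {b} at average distance {d}")
```

For n items the table starts with n(n-1)/2 entries, so holding it in memory becomes impractical well before the point counts the abstract has in mind, which is the motivation for approximate approaches.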
19 Citations
Scalable Hierarchical Clustering: Twister Tries with a Posteriori Trie Elimination
  • 1
Subquadratic High-Dimensional Hierarchical Clustering
  • 6
Fully dynamic hierarchical diameter k-clustering and k-center
  • 1
Reliable Agglomerative Clustering
  • 1
Systematic Review of Clustering High-Dimensional and Large Datasets
  • 13
On Efficient Low Distortion Ultrametric Embedding
Objective-Based Hierarchical Clustering of Deep Embedding Vectors
  • 1
Large Scale Knowledge Matching with Balanced Efficiency-Effectiveness Using LSH Forest
  • 1
Hierarchical Correlation Clustering and Tree Preserving Embedding
  • 2
