• Publications
  • Influence
CURE: an efficient clustering algorithm for large databases
Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical…
  • 2,733
  • 216
ROCK: a robust clustering algorithm for categorical attributes
We study clustering algorithms for data with Boolean and categorical attributes. We show that traditional clustering algorithms that use distances between points for clustering are not appropriate…
  • 1,485
  • 160
Efficient algorithms for mining outliers from large data sets
In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor. We rank each point on the basis of its distance to…
  • 1,306
  • 92
A cost-based model and effective heuristic for repairing constraints by value modification
Data integrated from multiple sources may contain inconsistencies that violate integrity constraints. The constraint repair problem attempts to find "low cost" changes that, when applied, will cause…
  • 376
  • 56
SPIRIT: Sequential Pattern Mining with Regular Expression Constraints
  • 578
  • 48
Graph summarization with bounded error
We propose a highly compact two-part representation of a given graph G consisting of a graph summary and a set of corrections. The graph summary is an aggregate graph in which each node corresponds…
  • 291
  • 41
Provisioning a virtual private network: a network design problem for multicommodity flow
Consider a setting in which a group of nodes, situated in a large underlying network, wishes to reserve bandwidth on which to support communication. Virtual private networks (VPNs) are services that…
  • 293
  • 39
Approximate query processing using wavelets
Approximate query processing has emerged as a cost-effective approach for dealing with the huge data volumes and stringent response-time requirements of today's decision support systems…
  • 496
  • 32
Algorithms for provisioning virtual private networks in the hose model
Virtual Private Networks (VPNs) provide customers with predictable and secure network connections over a shared network. The recently proposed hose model for VPNs allows for greater flexibility since…
  • 152
  • 29
Efficient filtering of XML documents with XPath expressions
We propose a novel index structure, termed XTrie, that supports the efficient filtering of XML documents based on XPath expressions. Our XTrie index structure offers several novel features that make…
  • 148
  • 22