CURE: an efficient clustering algorithm for large databases
We propose a new clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes and wide variances in size.
  • 2,772
  • 216
ROCK: a robust clustering algorithm for categorical attributes
We study clustering algorithms for data with Boolean and categorical attributes.
  • 1,502
  • 161
Efficient algorithms for mining outliers from large data sets
In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor.
  • 1,334
  • 98
A cost-based model and effective heuristic for repairing constraints by value modification
We define a database repair as a set of value modifications.
  • 382
  • 56
SPIRIT: Sequential Pattern Mining with Regular Expression Constraints
We study the mining of frequent sequential patterns that also satisfy user-specified regular expression constraints.
  • 587
  • 48
Graph summarization with bounded error
We propose a highly compact two-part representation of a given graph G consisting of a graph summary and a set of corrections, and use the summaries (along with corrections) to compress graphs with bounded error.
  • 298
  • 41
Provisioning a virtual private network: a network design problem for multicommodity flow
We provide optimal and approximate algorithms for several variants of this general network design problem, depending on whether the traffic matrix is required to be symmetric, and we extend their results by providing a stronger approximation algorithm for this latter problem.
  • 295
  • 39
Approximate query processing using wavelets
In this paper, we propose the use of multi-dimensional wavelets as an effective tool for general-purpose approximate query processing in modern, high-dimensional applications.
  • 502
  • 33
Algorithms for provisioning virtual private networks in the hose model
We present a novel integer programming formulation for the general VPN tree computation problem (that is, when ingress and egress bandwidths of VPN endpoints are arbitrary) and develop an algorithm that is based on the primal-dual method.
  • 154
  • 29
Efficient filtering of XML documents with XPath expressions
We propose a novel index structure, termed XTrie, that supports the efficient filtering of XML documents based on XPath expressions, with both ordered and unordered matching of XML data.
  • 148
  • 22