Algorithms for Mining Distance-Based Outliers in Large Datasets
This paper provides formal and empirical evidence showing the usefulness of DB-outliers and presents two simple algorithms for computing such outliers, both having a complexity of O(k N^2), k being the dimensionality and N being the number of objects in the dataset.
Distance-based outliers: algorithms and applications
Outlier detection can be done efficiently for large datasets, and for k-dimensional datasets with large values of k, and it is shown that outlier detection is a meaningful and important knowledge discovery task.
A Unified Notion of Outliers: Properties and Computation
A unified outlier detection system can replace a whole spectrum of statistical discordancy tests with a single module detecting only the kinds of outliers proposed.
A unified approach for mining outliers
The proposed, intuitive notion of outliers can unify or generalize many of the existing notions of outliers provided by discordancy tests for standard statistical distributions, so that when mining large datasets containing many attributes, a unified approach can replace many statistical discordancy tests, regardless of any knowledge about the underlying distribution of the attributes.
Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining
  • E. Knorr, R. Ng
  • Computer Science
  • IEEE Trans. Knowl. Data Eng.
  • 1 December 1996
The main contribution of the paper is the development of Algorithm GenCom (Generalization for Commonality extraction) that makes use of concept generalization to effectively derive many meaningful commonalities that cannot be found otherwise.
Finding Boundary Shape Matching Relationships in Spatial Data
This paper provides an approach for detecting a boundary shape match between the facing curves of the cluster and feature, and shows how to quantify the value of the match.
Robust space transformations for distance-based operations
The fundamental question that this paper addresses is: "What then is an appropriate space?" The paper proposes using a robust space transformation called the Donoho-Stahel estimator, which, in spite of frequent updates, does not lose its usefulness or require re-computation.
Extraction of Spatial Proximity Patterns by Concept Generalization
Algorithm GenDis is developed, which uses concept generalization to identify the distinguishing features or concepts that serve as discriminators, and determines which discriminators are "better" than others by using a ranking system to quantitatively weigh maximal discriminators from different concept hierarchies.