Spatial categorical outlier detection: pair correlation function based approach


Spatial Categorical Outlier Detection (SCOD) has attracted considerable attention in spatial data mining and geological analysis. Faced with an SCOD problem, some researchers apply Spatial Numerical Outlier Detection measures after mapping categorical attributes to continuous ones. However, such approaches fail to capture the special properties of spatial categorical data and are therefore prone to masking and swamping. In this paper, we model the spatial dependencies between spatial categorical observations and propose a Pair Correlation Function (PCF) based method to detect Spatial Categorical Outliers (SCOs). First, a new metric, the Pair Correlation Ratio (PCR), is estimated for each pair of categories from their co-occurrence frequency at different spatial distances. The discrete PCRs are then fitted to a continuous function of distance. The outlier score of an observation is the average PCR between that observation and its spatial neighbors, and observations with the lowest scores are labeled as potential SCOs. Extensive experiments demonstrate that the PCF-based method outperforms existing approaches.
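The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the PCR is taken to be the observed co-occurrence count of a category pair in a distance bin divided by the count expected if categories were spatially independent, the "continuous function" is a piecewise-linear interpolation over bin centers (`np.interp`), neighbors are the `k` nearest points, and the function name `pcr_outlier_scores` with its parameters `n_bins` and `k` is hypothetical.

```python
import numpy as np

def pcr_outlier_scores(coords, labels, n_bins=4, k=4):
    """Illustrative sketch (assumed formulation, not the paper's exact method)."""
    coords = np.asarray(coords, dtype=float)
    cats, codes = np.unique(np.asarray(labels), return_inverse=True)
    n, m = len(codes), len(cats)
    freq = np.bincount(codes, minlength=m) / n  # marginal category frequencies

    # Pairwise Euclidean distances; the diagonal is set to inf to exclude self-pairs.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)

    # k nearest neighbors of every observation.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    max_r = dist[np.arange(n)[:, None], nbrs].max()

    # Assign every ordered pair within max_r to a distance bin.
    edges = np.linspace(0.0, max_r, n_bins + 1)
    bins = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)
    i_idx, j_idx = np.nonzero(dist <= max_r)

    # Observed co-occurrence counts per (category, category, bin).
    obs = np.zeros((m, m, n_bins))
    np.add.at(obs, (codes[i_idx], codes[j_idx], bins[i_idx, j_idx]), 1)

    # Expected counts under independence; keep only non-empty bins.
    total = obs.sum(axis=(0, 1))
    valid = total > 0
    expected = freq[:, None, None] * freq[None, :, None] * total[None, None, valid]
    pcr = obs[:, :, valid] / expected
    centers = (0.5 * (edges[:-1] + edges[1:]))[valid]

    # Outlier score: mean interpolated PCR between a point and its neighbors.
    scores = np.empty(n)
    for i in range(n):
        vals = [np.interp(dist[i, j], centers, pcr[codes[i], codes[j]])
                for j in nbrs[i]]
        scores[i] = np.mean(vals)
    return scores

# Toy data: a 10 x 10 grid, category "A" on the left half and "B" on the right,
# with a single "B" planted inside the "A" region.
coords = np.array([(x, y) for x in range(10) for y in range(10)])
labels = np.where(coords[:, 0] < 5, "A", "B")
planted = 2 * 10 + 5  # index of grid point (2, 5)
labels[planted] = "B"

scores = pcr_outlier_scores(coords, labels)
suspect = int(np.argmin(scores))  # observation with the lowest average PCR
```

On this toy grid the planted observation is surrounded only by neighbors of the other category, so its average PCR is the lowest and it is the one flagged. A real application would need a principled choice of bin width, neighborhood size, and fitting function, none of which the abstract specifies.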

DOI: 10.1145/2093973.2094049

Cite this paper

@inproceedings{Liu2011SpatialCO, title={Spatial categorical outlier detection: pair correlation function based approach}, author={Xutong Liu and Feng Chen and Chang-Tien Lu}, booktitle={GIS}, year={2011} }