Non-Euclidean norms and data normalisation

Abstract

In this paper, we empirically examine the use of a range of Minkowski norms for the clustering of real-world data. We also investigate whether normalising the data prior to clustering affects the quality of the result. In a nearest-neighbour search on raw real-world data sets, fractional norms outperform the Euclidean and higher-order norms. However, when the data are normalised, the results of the nearest-neighbour search with the fractional norms are very similar to those obtained with the Euclidean norm. Using both K-means clustering, a classic statistical technique, and the Neural Gas artificial neural network, we show that on raw real-world data the use of a fractional norm does not improve the recovery of cluster structure. Normalising the data, however, improves recovery accuracy and minimises the effect of the differing norms.
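To make the terms in the abstract concrete, the following is a minimal Python sketch of a Minkowski dissimilarity with an adjustable exponent p, a nearest-neighbour lookup that uses it, and a per-feature normalisation step. It is illustrative only and is not the authors' code: the function names are hypothetical, "fractional norm" is taken to mean 0 < p < 1, and z-score scaling is an assumed normalisation scheme, since the abstract does not say which one was used.

```python
import math


def minkowski_distance(x, y, p):
    """Return (sum_i |x_i - y_i|^p)^(1/p).

    p = 2 gives the Euclidean distance, p = 1 the city-block distance.
    For 0 < p < 1 (assumed here to be the "fractional" case) the result
    is still a usable dissimilarity, although it no longer satisfies
    the triangle inequality.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def zscore_normalise(rows):
    """Scale every feature (column) to zero mean and unit variance.

    Assumption: z-score scaling stands in for whatever normalisation
    the paper actually applies.
    """
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = []
    for c, m in zip(cols, means):
        s = math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        stds.append(s if s > 0 else 1.0)  # leave constant columns unscaled
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest_neighbour(query_index, rows, p):
    """Index of the row closest to rows[query_index], excluding itself."""
    query = rows[query_index]
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: minkowski_distance(query, rows[i], p))


if __name__ == "__main__":
    # Toy data (made up): the second feature has a much larger scale.
    data = [[1.0, 1000.0], [1.2, 2000.0], [5.0, 1010.0]]
    for p in (0.5, 1.0, 2.0):
        raw = nearest_neighbour(0, data, p)
        scaled = nearest_neighbour(0, zscore_normalise(data), p)
        print(f"p={p}: raw neighbour={raw}, normalised neighbour={scaled}")
```

On raw data the large-scale feature dominates the distance for every p, which is one reason normalisation can change which neighbour is selected. The toy data are invented and say nothing about the paper's reported results.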

Cite this paper

@inproceedings{Doherty2004NonEuclideanNA,
  title={Non-Euclidean norms and data normalisation},
  author={Kevin Doherty and Rod Adams and Neil Davey},
  booktitle={ESANN},
  year={2004}
}