# Non-parametric detection of meaningless distances in high dimensional data

@article{Kabn2012NonparametricDO, title={Non-parametric detection of meaningless distances in high dimensional data}, author={A. Kab{\'a}n}, journal={Statistics and Computing}, year={2012}, volume={22}, pages={375-385} }

Distance concentration is the phenomenon that, in certain conditions, the contrast between the nearest and the farthest neighbouring points vanishes as the data dimensionality increases. It affects high dimensional data processing, analysis, retrieval, and indexing, which all rely on some notion of distance or dissimilarity. Previous work has characterised this phenomenon in the limit of infinite dimensions. However, real data is finite dimensional, and hence the infinite-dimensional…
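The vanishing contrast described in the abstract can be seen in a small simulation. The sketch below is illustrative only and is not the paper's non-parametric test: it assumes i.i.d. uniform coordinates and Euclidean distance (a setting in which concentration is known to occur), and the helper names `relative_contrast` and `mean_contrast`, as well as the sample sizes and dimensions, are hypothetical choices made for this example.

```python
import math
import random


def relative_contrast(points, query):
    """Return (d_max - d_min) / d_min over Euclidean distances to `query`."""
    dists = [math.dist(p, query) for p in points]
    return (max(dists) - min(dists)) / min(dists)


def mean_contrast(dim, n_points=200, n_trials=5, seed=0):
    """Average the relative contrast over a few random data sets.

    Assumption for illustration: every coordinate is drawn i.i.d.
    from the uniform distribution on [0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
        query = [rng.random() for _ in range(dim)]
        total += relative_contrast(points, query)
    return total / n_trials


# Relative contrast for increasing dimensionality.
contrasts = {dim: mean_contrast(dim) for dim in (2, 20, 200, 1000)}

if __name__ == "__main__":
    for dim, value in contrasts.items():
        print(f"dim={dim:5d}  relative contrast={value:.3f}")
```

Under these assumptions the printed contrast shrinks as the dimension grows, so the nearest and farthest points become almost equally distant from the query. Data with strong low-dimensional structure may behave quite differently.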

#### 48 Citations

Multiplicative distance: a method to alleviate distance instability for high-dimensional data

- Mathematics, Computer Science
- Knowledge and Information Systems
- 2014

On the Behavior of Intrinsically High-Dimensional Spaces: Distances, Direct and Reverse Nearest Neighbors, and Hubness

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2017

The Role of Hubness in Clustering High-Dimensional Data

- Mathematics, Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2014

Instability results for Euclidean distance, nearest neighbor search on high dimensional Gaussian data

- Computer Science
- Inf. Process. Lett.
- 2021

Out-of-Sample Error Estimation: The Blessing of High Dimensionality

- Computer Science
- 2014 IEEE International Conference on Data Mining Workshop
- 2014

Choosing ℓp norms in high-dimensional spaces based on hub analysis

- Computer Science, Medicine
- Neurocomputing
- 2015

#### References

On the Surprising Behavior of Distance Metrics in High Dimensional Spaces

- Computer Science
- ICDT
- 2001

The Concentration of Fractional Distances

- Mathematics, Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2007

When is 'nearest neighbour' meaningful: A converse theorem and implications

- Mathematics, Computer Science
- J. Complex.
- 2009

On the Design and Applicability of Distance Functions in High-Dimensional Data Space

- Mathematics, Computer Science
- IEEE Trans. Knowl. Data Eng.
- 2009

Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2010

On the distance concentration awareness of certain data reduction techniques

- Mathematics, Computer Science
- Pattern Recognit.
- 2011

New instability results for high-dimensional nearest neighbor search

- Mathematics, Computer Science
- Inf. Process. Lett.
- 2009

Classification of Anti-learnable Biological and Synthetic Data

- Mathematics, Computer Science
- PKDD
- 2007