Non-parametric detection of meaningless distances in high dimensional data

@article{Kabn2012NonparametricDO,
  title={Non-parametric detection of meaningless distances in high dimensional data},
  author={A. Kab{\'a}n},
  journal={Statistics and Computing},
  year={2012},
  volume={22},
  pages={375--385}
}
  • A. Kabán
  • Published 2012
  • Mathematics, Computer Science
  • Statistics and Computing
Distance concentration is the phenomenon that, in certain conditions, the contrast between the nearest and the farthest neighbouring points vanishes as the data dimensionality increases. It affects high dimensional data processing, analysis, retrieval, and indexing, which all rely on some notion of distance or dissimilarity. Previous work has characterised this phenomenon in the limit of infinite dimensions. However, real data is finite dimensional, and hence the infinite-dimensional…
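The vanishing contrast described in the abstract can be illustrated with a small simulation. This is only a minimal sketch and not the paper's non-parametric detection method: it assumes points drawn i.i.d. uniformly from the unit cube, Euclidean distance, and a hypothetical helper named `relative_contrast` that returns (d_max - d_min) / d_min for one random query point.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Relative contrast (d_max - d_min) / d_min of the Euclidean distances
    from one random query point to n_points random points.

    Assumption for illustration: all coordinates are i.i.d. uniform on [0, 1].
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance (Python 3.8+)
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should shrink as `dim` grows. Data with strong dependencies between features need not behave this way, which is consistent with the abstract's qualifier "in certain conditions".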
Multiplicative distance: a method to alleviate distance instability for high-dimensional data
On the Behavior of Intrinsically High-Dimensional Spaces: Distances, Direct and Reverse Nearest Neighbors, and Hubness
  • F. Angiulli
  • Mathematics, Computer Science
  • J. Mach. Learn. Res.
  • 2017
K-Means Based Clustering In High Dimensional Data
The Role of Hubness in Clustering High-Dimensional Data
Out-of-Sample Error Estimation: The Blessing of High Dimensionality
Choosing ℓp norms in high-dimensional spaces based on hub analysis

References

On the Surprising Behavior of Distance Metrics in High Dimensional Spaces
The Concentration of Fractional Distances
When is 'nearest neighbour' meaningful: A converse theorem and implications
On the Design and Applicability of Distance Functions in High-Dimensional Data Space
Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data
When Is ''Nearest Neighbor'' Meaningful?
On the distance concentration awareness of certain data reduction techniques
  • A. Kabán
  • Mathematics, Computer Science
  • Pattern Recognit.
  • 2011
New instability results for high-dimensional nearest neighbor search
  • C. Giannella
  • Mathematics, Computer Science
  • Inf. Process. Lett.
  • 2009
Measure Concentration of Strongly Mixing Processes with Applications
Classification of Anti-learnable Biological and Synthetic Data