Geodesic distances in high-dimensional spaces, the Sneodesic distance overcomes the curse of dimensionality
Files
Lebrun_13101900_2024.pdf
Embargoed access until 2025-07-01 - Adobe PDF - 30.52 MB
Details
- Abstract
- In data analysis, the choice of distance metric is paramount, especially in high-dimensional spaces, where traditional measures such as the Euclidean distance become less effective because of the curse of dimensionality. This thesis introduces the sneodesic distance, a novel metric designed to address this challenge. The term 'sneodesic' is a contraction of 'SNE' (Stochastic Neighbor Embedding) and 'geodesic', reflecting its foundation. By incorporating SNE similarities computed with a small perplexity into geodesic distances, the sneodesic distance maintains meaningful relationships between data points and accounts for both the local and the global structure of the dataset. The new metric proves beneficial in several machine learning applications and, in some situations, improves dimensionality reduction for data visualization. Theoretical explorations and empirical evaluations demonstrate the robustness and broad applicability of the sneodesic distance, making it a valuable tool for high-dimensional data analysis.
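
The loss of effectiveness that the abstract attributes to the curse of dimensionality can be illustrated with a small experiment. The sketch below is not taken from the thesis and does not implement the sneodesic distance. It assumes points drawn uniformly from the unit hypercube (a choice made only for this example) and uses a hypothetical helper, `relative_contrast`, to measure how much farther the farthest point is from a query than the nearest one. As the dimension grows, this ratio typically shrinks, which is one common way of describing why raw Euclidean distances become less informative.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    query point to n_points other points.

    Assumption of this example: all points are drawn uniformly from the
    unit hypercube [0, 1]^dim. The helper name is hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))  # Euclidean distance
    return (max(distances) - min(distances)) / min(distances)


if __name__ == "__main__":
    # Print the contrast for a few dimensions to see the trend.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

A small relative contrast means that the nearest and farthest neighbours of a point lie at almost the same Euclidean distance, so ranking neighbours by that distance carries little information.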