What is the intuition behind the relationship between the dimensions of a model and the performance of k-nearest neighbors?

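Regarding the properties of k-nearest neighbors, on page 38 of Elements of Statistical Learning, the authors write:

“...as the dimension p gets large, so does the metric size of the k-nearest neighborhood. So settling for nearest neighborhood as a surrogate for conditioning will fail us miserably.”

Does this mean that, holding k constant, as we add features to a model, the distance between observations increases, and with it the size of the neighborhoods, so that the model's variance increases?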

The curse of dimensionality comes in various shapes. For machine learning in particular, there is a discussion here.

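In general, as the dimensionality increases, the relative differences in distance between points become smaller and smaller. For d=1000 dimensions, it is very unlikely that any point A in a random dataset is significantly closer to a given point B than any other point is. In a way, this can be explained by saying that with d=1000, a point A is unlikely to be close to a point B in the vast majority of dimensions (or at least unlikely to be closer than any other arbitrary point).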
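To make the shrinking relative differences concrete, here is a minimal sketch, assuming points drawn uniformly at random from the unit hypercube and Euclidean distance. The function names `distance_contrast` and `relative_contrast`, the sample size and the seed are all illustrative choices of mine, not something from the book. It compares how much farther the farthest point is than the nearest one, for a few values of d.

```python
import numpy as np


def distance_contrast(d, n_points=1000, seed=0):
    """Return (min, mean, max) Euclidean distance from one random query point
    to n_points points drawn uniformly from the unit hypercube [0, 1]^d.

    Uniform data and the Euclidean metric are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, d))
    query = rng.random(d)
    dists = np.linalg.norm(points - query, axis=1)
    return dists.min(), dists.mean(), dists.max()


def relative_contrast(d, **kwargs):
    """(max - min) / min: how much farther the farthest point is than the nearest."""
    nearest, _, farthest = distance_contrast(d, **kwargs)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    for d in (2, 10, 100, 1000):
        print(f"d={d:5d}  relative contrast={relative_contrast(d):.3f}")
```

In low dimensions the nearest point is many times closer than the farthest one. As d grows the ratio should fall toward zero, which is the sense in which the "nearest" neighbor stops being meaningfully nearer than anything else.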

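Another aspect is that the properties of volumes become unintuitive as 'd' increases. For example, even for a relatively moderate d=25 (if I remember correctly), the volume of the unit cube (edge length = 1) is more than 1,000,000 times larger than the volume of the unit sphere (the sphere with diameter = 1). I mention this because your quote refers to 'metric size', but I am not sure how this affects kNN.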
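The volume figure can be checked with the standard formula for the volume of a d-dimensional ball, V = π^(d/2) / Γ(d/2 + 1) · r^d. The sketch below uses only the standard library; the helper names are mine. The last helper, `edge_length`, is one possible reading of 'metric size' and is an assumption on my part: for data uniform in the unit hypercube, it gives the edge of the sub-cube needed to capture a given fraction of the points, which is r^(1/p).

```python
import math


def ball_volume(d, radius):
    """Volume of a d-dimensional Euclidean ball: pi^(d/2) / Gamma(d/2 + 1) * r^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d


def cube_to_ball_ratio(d):
    """Volume of the unit cube (edge 1, so volume 1) divided by the volume
    of the ball with diameter 1 inscribed in it."""
    return 1.0 / ball_volume(d, 0.5)


def edge_length(fraction, p):
    """Edge of the sub-cube that captures `fraction` of data uniform in [0, 1]^p.

    Assumption: this is one way to interpret the 'metric size' of a neighborhood.
    """
    return fraction ** (1.0 / p)


if __name__ == "__main__":
    for d in (2, 3, 10, 25):
        print(f"d={d:3d}  cube/ball volume ratio={cube_to_ball_ratio(d):.3e}")
    for p in (1, 2, 10, 100):
        print(f"p={p:3d}  edge needed for 1% of the data={edge_length(0.01, p):.3f}")
```

Under this reading, a neighborhood holding a fixed fraction of the data has to span most of the range of every feature once p is large, so it is no longer local. That would be the link to kNN, though it rests on the uniform-data assumption.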

performance

dimensions

knn