How to calculate normalized Euclidean distance on two vectors?
Let's say I have the following two vectors:
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
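The first seven elements are continuous values in the range [1,10]. The last element is an integer in the range [1,10].
Now I would like to compute the Euclidean distance between x and y. I think the integer element is a problem: all the other elements can get arbitrarily close to each other, but the integer element always differs in steps of one. So the distance is biased towards the integer element.
How can I calculate something like a normalized Euclidean distance on it?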
According to Wolfram Alpha, and the following answer from Cross Validated, the normalized Euclidean distance is defined as:
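Written out to match the MATLAB expressions that follow (a reconstruction from those expressions; strictly speaking this is the normalized squared Euclidean distance, with Var taken as the sample variance over a vector's elements):

$$
\mathrm{NED}^2(x, y) = \frac{1}{2}\,\frac{\operatorname{Var}(x - y)}{\operatorname{Var}(x) + \operatorname{Var}(y)}
$$

Since $\operatorname{Var}(x - y) \le 2\,(\operatorname{Var}(x) + \operatorname{Var}(y))$, the value lies in $[0, 1]$.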
You can calculate it with MATLAB by using:
0.5*(std(x-y)^2) / (std(x)^2+std(y)^2)
Or, alternatively, you can use:
0.5*((norm((x-mean(x))-(y-mean(y)))^2)/(norm(x-mean(x))^2+norm(y-mean(y))^2))
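The two expressions should agree, because `std(v)^2` equals `norm(v - mean(v))^2 / (numel(v) - 1)` and the common factor cancels in the ratio. A minimal sketch to check this numerically; the names `d_std` and `d_norm` and the fixed seed are illustrative choices, not part of the original answer:

```matlab
rng(0);  % fixed seed so the run is repeatable (illustrative choice)

% Same construction as in the question
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

% Variance form
d_std = 0.5*(std(x-y)^2) / (std(x)^2 + std(y)^2);

% Centered-norm form
xc = x - mean(x);
yc = y - mean(y);
d_norm = 0.5*(norm(xc - yc)^2) / (norm(xc)^2 + norm(yc)^2);

% The difference should be at floating-point rounding level
fprintf('d_std = %.6f, d_norm = %.6f, diff = %.2e\n', ...
        d_std, d_norm, abs(d_std - d_norm));
```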
I would rather normalise x and y before calculating the distance; then the vanilla Euclidean distance would suffice.
In your example:
x_norm = (x -1) / 9; % normalised x
y_norm = (y -1) / 9; % normalised y
dist = norm(x_norm - y_norm); % Euclidean distance between normalised x, y
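Because every element here shares the same range [1,10], this scaling uses the same constants for all elements, so the normalised distance works out to `norm(x - y) / 9`. A small sketch to check this, assuming the x and y from the question (the names `d_raw` and `d_scaled` are illustrative):

```matlab
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

d_raw    = norm(x - y);                   % plain Euclidean distance
d_scaled = norm((x - 1)/9 - (y - 1)/9);   % distance after min-max scaling to [0,1]

% Equal up to rounding: identical constants for every element rescale the distance
fprintf('d_scaled = %.6f, d_raw/9 = %.6f\n', d_scaled, d_raw/9);
```

Per-element scaling changes the relative weighting of elements only when their ranges differ.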
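However, I am not sure whether having an integer element contributes some sort of bias, but we have already drifted a bit off-topic for Stack Overflow :)
From Euclidean Distance - raw, normalized and double‐scaled coefficients: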
SYSTAT, Primer 5, and SPSS provide Normalization options for the data so as to permit an investigator to compute a distance coefficient which is essentially “scale free”. Systat 10.2’s normalised Euclidean distance produces its “normalisation” by dividing each squared discrepancy between attributes or persons by the total number of squared discrepancies (or sample size).
Frankly, I can see little point in this standardization – as the final coefficient still remains scale‐sensitive. That is, it is impossible to know whether the value indicates high or low dissimilarity from the coefficient value alone.
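One possible reading of the Systat normalisation described in the quote, sketched in MATLAB: sum the squared discrepancies, divide by how many there are, and take the square root. Taking the square root is an assumption here (the quote only describes the division), and the name `d_systat` is illustrative:

```matlab
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

sq = (x - y).^2;                        % squared discrepancy per element
d_systat = sqrt(sum(sq) / numel(sq));   % assumed form: root of the mean squared discrepancy

% Same as the raw distance divided by sqrt(n), so it still depends on the data's scale
fprintf('d_systat = %.6f, norm(x-y)/sqrt(n) = %.6f\n', ...
        d_systat, norm(x - y)/sqrt(numel(x)));
```

Under this reading the coefficient is the raw Euclidean distance divided by a constant, which is consistent with the quote's remark that it remains scale-sensitive.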