Is there a faster/more compact way of obtaining the indices from squareform? (MATLAB)

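Hi all. I have a matrix of 3-D data points called "data", with dimensions N-by-3. I am trying to obtain two things:

First, the indices "m" and "n" into the distance matrix "Dist", where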

Dist = squareform(pdist(data));

such that

[m,n] = find( Dist<=rc & Dist>0 );

where "rc" is a cutoff distance, "m" is the row index, and "n" is the column index.

Second, the conditional distances "ConDist", where

d = pdist(data);
ConDist = d( d<=rc & d>0 );

This code works fine for small "data" (N < 3500), but for large "data" (N > 25000) it takes too much time and memory. I therefore tried to reduce both by doing the following:

Dist = zeros(size(data,1));
Dist(tril(true(size(data,1)),-1)) = pdist(data);
[m,n] = find(Dist <= rc  &  Dist > 0);
ConDist = Dist(Dist <= rc  &  Dist > 0);

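Here I fill only the lower triangle instead of calling "squareform", hoping to reduce computation time (or memory; I am not sure whether MATLAB actually handles this version any more cheaply). However, computing the "Dist" variable still seems to take a lot of time and memory.

Is there a faster or less memory-consuming way to compute "m", "n" and "ConDist"? Thank you very much.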

This could be one approach:

N = size(data,1); % number of data points

% Store the transpose of data, since it is needed in several places
data_t = data.';

% Squared distances via matrix multiplication; tril(...,-1) keeps only the
% strictly lower triangle, so the diagonal (self-distances) is dropped
sqdist = tril([-2*data data.^2 ones(N,3)]*[data_t ; ones(3,N) ; data_t.^2], -1);

% Logical mask, same size as the distance matrix, true where the squared
% distance is positive and within the cutoff
mask_dists = sqdist <= rc^2  &  sqdist > 0;

% Indices and distances that satisfy the thresholding criterion
[m,n] = find(mask_dists);
ConDist = sqrt(sqdist(mask_dists));
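
In case it is not obvious why the matrix product above yields squared distances, the identity behind it can be written out as follows. Here a_i denotes row i of data; that notation is introduced only for this note.

```latex
\[
\lVert a_i - a_j \rVert^2
  = \sum_{k=1}^{3} (a_{ik} - a_{jk})^2
  = -2 \sum_{k=1}^{3} a_{ik} a_{jk}
    + \sum_{k=1}^{3} a_{ik}^2
    + \sum_{k=1}^{3} a_{jk}^2
\]
```

Row i of [-2*data data.^2 ones(N,3)] holds (-2*a_i, a_i.^2, 1, 1, 1) and column j of [data_t ; ones(3,N) ; data_t.^2] holds (a_j, 1, 1, 1, a_j.^2), so their dot product reproduces the three sums term by term. Because the result is computed by subtraction, entries that should be exactly zero can come out as tiny nonzero values in floating point.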

You could introduce bsxfun here in place of tril (keeping the rest unchanged) and see whether it speeds things up further:

sqdist = [-2*data data.^2 ones(N,3)]*[data_t ; ones(3,N) ; data_t.^2];
% bsxfun(@gt,...) is true only strictly below the diagonal (row > column)
mask_dists = sqdist <= rc^2  &  sqdist > 0  &  bsxfun(@gt,(1:N).',1:N);
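
If the full N-by-N matrix is itself the bottleneck (for N = 25000 a double-precision N-by-N array needs 25000^2 * 8 bytes, roughly 5 GB), the same matrix-multiplication idea could be applied to one block of columns at a time. The following is only a sketch under assumptions: the example data, the cutoff value and the block size blk are made up for illustration, and it has not been benchmarked against the versions above.

```matlab
% Hypothetical example data and cutoff, chosen only for illustration
rng(0);
data = rand(2000,3);
rc   = 0.05;

N   = size(data,1);
blk = 500;                  % assumed block size; tune to available memory
sq  = sum(data.^2, 2);      % squared norm of every point (N-by-1)
m = []; n = []; ConDist = [];

for j0 = 1:blk:N
    cols = j0:min(j0+blk-1, N);        % column indices covered by this block

    % Squared distances between all points and the points in this block
    sqd = bsxfun(@plus, sq, sq(cols).') - 2*(data*data(cols,:).');

    % Keep pairs within the cutoff and strictly below the diagonal
    mask = sqd <= rc^2  &  sqd > 0  &  bsxfun(@gt, (1:N).', cols);

    [mi, ni] = find(mask);
    % Results grow inside the loop; acceptable when few pairs pass the cutoff
    m       = [m; mi];
    n       = [n; ni + (j0-1)];        % shift block-local column to global index
    ConDist = [ConDist; sqrt(sqd(mask))];
end
```

Only an N-by-blk slab exists at any time (about 100 MB for N = 25000 and blk = 500), and each pair is reported once with m > n, in the same column-major order as the tril-based version. Distances much smaller than the coordinate values may lose some precision with this formula, so it is worth checking against pdist on a small subset first.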