How to measure the variance of a population of permutations?

Question

I need to compute the variance of a population (array) of permutations.

Suppose I have this array of permutations:

import numpy as np
import scipy.stats as stats


a = np.matrix([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])

# distance between a[0] and a[1]
distance = stats.kendalltau(a[0], a[1])[0]

So, how do I compute (in Python) the variance of this array, i.e. how do I measure how far these permutations are from each other?

Regards,

Aymeric

P.S. I define the distance between two permutations using the kendalltau measure.

Answer 1

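I am not sure this is the mathematical result you are looking for. You can use stats.kendalltau to compute the distance for every possible pair, and then take the variance of the resulting vector.

To get the distance vector I use np.roll. For the three rows here, pairing each row with the next one (cyclically) covers every pair; with more rows it would only cover neighbouring rows.

Iterate over the zipped list (a, a-shifted):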

dist = []
# pair every row with the previous row (cyclically) and store Kendall's tau
for x1, x2 in zip(a, np.roll(a, shift=1, axis=0)):
    dist.append(stats.kendalltau(x1, x2)[0])

Take the standard deviation of all the distances:

np.std(dist)

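Or, if you are looking for the variance in the sense discussed here, take the norm of the distance vector (for the statistical variance of the distances, use np.var(dist) instead):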

np.linalg.norm(dist)

Note that I defined a as an np.array, not an np.matrix:

a = np.array([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])
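The np.roll trick only pairs each row with its cyclic neighbour. As a minimal sketch of the "all possible pairs" idea for any number of rows, one could enumerate the pairs with itertools.combinations instead. The variable name taus is just an illustrative choice, and this assumes the spread of the pairwise tau values is the quantity wanted:

```python
from itertools import combinations

import numpy as np
from scipy import stats

a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 6, 1, 5],
              [6, 3, 1, 2, 5, 4]])

# Kendall's tau for every unordered pair of rows (i < j)
taus = np.array([stats.kendalltau(a[i], a[j])[0]
                 for i, j in combinations(range(len(a)), 2)])

print(taus)          # one value per pair of rows
print(np.var(taus))  # population variance (ddof=0)
print(np.std(taus))  # population standard deviation (ddof=0)
```

With n rows this evaluates n*(n-1)/2 pairs, so it is fine for a small population but grows quadratically.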

Answer 2

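I assume you are looking for something that broadcasts the kendalltau function over each of the 3 arrays and evaluates it for every pairing of them. The output in that case is a 3x3 matrix. However, I am not sure what you are looking for when you say you want the variance. Please clarify in the comments and I will update my answer accordingly. Hope this helps: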

a = np.array([[1,2,3,4,5,6], [2,3,4,6,1,5], [6,3,1,2,5,4]])

def f(a,b):
    return np.array(stats.kendalltau(a,b)[0])

vf = np.vectorize(f, signature='(m),(m)->()')

out = vf(a[:,None,:],a[None,:,:])
print(out)

array([[ 1.        ,  0.33333333, -0.06666667],
       [ 0.33333333,  1.        , -0.46666667],
       [-0.06666667, -0.46666667,  1.        ]])

So, how to compute (in Python) the variance on this array, i.e., how to measure how far these permutations are from each other?

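IIUC, if you want to compute the kendalltau distance between each combination and then check the standard deviation of those distances, you can pick out the lower triangle of the matrix (without the diagonal) with np.tril_indices and then pass the 3 resulting values to np.std: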

np.std(out[np.tril_indices(out.shape[0], k=-1)])

0.3265986323710904
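Kendall's tau is a correlation in [-1, 1] rather than a distance. A rough sketch, assuming the permutations contain no ties: in that case (1 - tau) / 2 equals the fraction of discordant pairs, i.e. the normalized Kendall tau distance in [0, 1]. The helper kendall_distance below is a hypothetical name introduced for illustration. scipy.spatial.distance.pdist accepts a callable metric and returns one value per pair of rows:

```python
import numpy as np
from scipy import stats
from scipy.spatial.distance import pdist

a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 3, 4, 6, 1, 5],
              [6, 3, 1, 2, 5, 4]])

def kendall_distance(u, v):
    """Normalized Kendall tau distance in [0, 1]; assumes no ties."""
    tau = stats.kendalltau(u, v)[0]
    return (1.0 - tau) / 2.0

# Condensed vector of pairwise distances, one entry per pair i < j
d = pdist(a, metric=kendall_distance)

print(d)
print(np.mean(d))  # average distance between permutations
print(np.var(d))   # variance of the pairwise distances (ddof=0)
```

Because the distance is an affine function of tau, the variance of d is simply a quarter of the variance of the tau values, so either quantity ranks populations of permutations the same way.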

