Which is the best way to get permutations of a bunch of indexes from a df

Question

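Which is the best way to get all pairwise permutations of a bunch of indexes? We want this in order to run a chi-squared test, and I may be reinventing the wheel here. So for the following dataframe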

 index   value
      a     1.0
      b     2.0
      c     4.0

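I would like to get the following, where each value is the sum of the values for the two indexes in the group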

group      value
      a,b     3.0
      b,c     6.0
      c,a     5.0

Answer 1

I think you should use combinations from itertools.

>>> from itertools import combinations
>>> datas = {'a': 1, 'b': 2, 'c': 3}  # toy values; the question's c is 4.0
>>> list(combinations(datas.keys(), 2))
[('a', 'b'), ('a', 'c'), ('b', 'c')]
>>> index_combination = combinations(datas.keys(), 2)
>>> for indexes in index_combination:
...     print(indexes, sum(datas[index] for index in indexes))
...
('a', 'b') 3
('a', 'c') 4
('b', 'c') 5
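
A note on terminology: the question asks for permutations, but the desired output lists each pair only once, which is what combinations produces. The following is a minimal sketch of the difference using the values from the question; the helper name pair_sums is only illustrative and not part of any library.

```python
from itertools import combinations, permutations

# Values taken from the question's dataframe.
data = {'a': 1.0, 'b': 2.0, 'c': 4.0}


def pair_sums(pairs):
    """Map each pair of keys to an 'x,y' label and the sum of its two values."""
    return {','.join(pair): sum(data[key] for key in pair) for pair in pairs}


# combinations: each unordered pair appears once (3 pairs for 3 keys).
unordered = pair_sums(combinations(data, 2))

# permutations: order matters, so both 'a,b' and 'b,a' appear (6 pairs).
ordered = pair_sums(permutations(data, 2))

print(unordered)
print(len(ordered))
```

Since addition is symmetric, the permutations version only repeats every sum under a reversed label, so combinations is probably what you want here.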

Answer 2

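You need to import itertools first (pd below is pandas, and df is the dataframe from the question)

import itertools
import pandas as pd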

In [32]:
indices = [','.join(pair) for pair in itertools.combinations(df.index, 2)]
indices
Out[32]:
['a,b', 'a,c', 'b,c']


In [31]:
values = [x + y for x, y in itertools.combinations(df['value'], 2)]
values
Out[31]:
[3.0, 5.0, 6.0]

In [36]:
pd.DataFrame(data = values , index=indices , columns=['values'])
Out[36]:
     values
a,b     3.0
a,c     5.0
b,c     6.0
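
The steps above can also be put together in one self-contained script. This is only a sketch: the function name pairwise_sums and its column and sep parameters are my own, and it assumes the index labels are unique and the dataframe has the question's single value column.

```python
from itertools import combinations

import pandas as pd


def pairwise_sums(df, column="value", sep=","):
    """Return one row per unordered pair of index labels, holding the summed value.

    Assumes the index labels of df are unique (hypothetical helper, not a pandas API).
    """
    pairs = list(combinations(df.index, 2))
    labels = [sep.join(map(str, pair)) for pair in pairs]
    sums = [df.loc[a, column] + df.loc[b, column] for a, b in pairs]
    return pd.DataFrame({column: sums}, index=pd.Index(labels, name="group"))


# The dataframe from the question.
df = pd.DataFrame({"value": [1.0, 2.0, 4.0]}, index=["a", "b", "c"])

result = pairwise_sums(df)
print(result)
```

The pairs come out as a,b / a,c / b,c rather than the a,b / b,c / c,a order in the question; the sums are the same, only the labels and row order differ.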

python

combinations

permutation

dataframe

pandas