pandas DataFrame to frozensets based on conditions

I have a dataset like this:

 node    community
  1         2
  2         4
  3         5
  4         2
  5         3
  7         1
  8         3
  10        4
  12        5

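I want to build frozensets from the node column so that nodes sharing the same community end up in the same frozenset. So the expected result is something like: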

 [frozenset([1, 4]), frozenset([2, 10]), frozenset([3, 12]), frozenset([5, 8]), frozenset([7])]

Is there a way to do this without converting the DataFrame to a list of lists? Thanks.

Use GroupBy + apply with frozenset:

res = df.groupby('community')['node'].apply(frozenset).values.tolist()

print(res)

[frozenset({7}), frozenset({1, 4}), frozenset({8, 5}),
 frozenset({2, 10}), frozenset({3, 12})]
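
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample data from the question and runs the same GroupBy step. The DataFrame construction is my assumption about how the data is stored (two integer columns named node and community); the groups come back ordered by community value because groupby sorts its keys by default.

```python
import pandas as pd

# Sample data copied from the question (assumed to be two integer columns).
df = pd.DataFrame({
    "node": [1, 2, 3, 4, 5, 7, 8, 10, 12],
    "community": [2, 4, 5, 2, 3, 1, 3, 4, 5],
})

# Group the node values by community, then turn each group into a frozenset.
# groupby sorts the community keys by default, so the list follows 1, 2, 3, 4, 5.
res = df.groupby("community")["node"].apply(frozenset).tolist()
print(res)
```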

I'd suggest iterating over your GroupBy object and building a mapping (a dict) from each community to its nodes.

communities = {k: frozenset(g['node']) for k, g in df.groupby('community')}
print(communities)
{1: frozenset({7}),
 2: frozenset({1, 4}),
 3: frozenset({5, 8}),
 4: frozenset({2, 10}),
 5: frozenset({3, 12})}

Or, if you want a list (you lose the information about the keys), then

communities = [frozenset(g['node']) for _, g in df.groupby('community')]
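
As a rough check that the dict and list forms hold the same groups, the sketch below rebuilds the sample data and compares the result against the expected output without depending on order. The DataFrame construction and the names as_list and expected are mine, not part of the question; expected uses frozenset([7]) for community 1, which is what the sample data implies. Because frozensets are hashable, they can themselves be placed in a set for this kind of comparison.

```python
import pandas as pd

# Sample data copied from the question (assumed column layout).
df = pd.DataFrame({
    "node": [1, 2, 3, 4, 5, 7, 8, 10, 12],
    "community": [2, 4, 5, 2, 3, 1, 3, 4, 5],
})

# Dict form: keeps the community label as the key.
communities = {k: frozenset(g["node"]) for k, g in df.groupby("community")}

# List form: same groups, keys dropped.
as_list = list(communities.values())

# Hypothetical expected result, written as a set of frozensets so that
# the comparison ignores the order of the groups.
expected = {
    frozenset([1, 4]),
    frozenset([2, 10]),
    frozenset([3, 12]),
    frozenset([5, 8]),
    frozenset([7]),
}
print(set(as_list) == expected)
```

The dict form is usually the safer default, since you can always drop the keys later but cannot recover them from the list.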