GroupBy results to dictionary of lists

Question

I have an Excel sheet that looks like this:

Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3

I want to extract this data, group it by Column1, and add it to a dictionary so that it looks like this:

{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}

So far, this is my code:

import pandas
excel = pandas.read_excel(r"e:\test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
myTable = excel.groupby("Column1").groups
print myTable

However, my output looks like this:

{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}

Thanks!

Answer 1

According to the docs, GroupBy.groups:

is a dict whose keys are the computed unique groups and corresponding values being the axis labels belonging to each group.

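If you want the values themselves, you can groupby 'Column1', select 'Column3', and then call apply, passing list so that it is applied to each group.

You can then convert the result to a dict as needed: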

In [5]:

dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

(Note: see this SO question for why the numbers are followed by L.)
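
To make the difference concrete, here is a minimal sketch that compares the two. The in-memory DataFrame is an assumption standing in for the read_excel call in the question (only columns A and C), and the names row_labels and values are just illustrative:

```python
import pandas as pd

# Hypothetical in-memory copy of the question's sheet (Column1 and Column3 only).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

grouped = df.groupby("Column1")

# .groups maps each key to the row labels (index values) of that group,
# which is what the question's output was showing.
row_labels = {key: list(labels) for key, labels in grouped.groups.items()}

# apply(list) collects the actual Column3 values of each group instead.
values = dict(grouped["Column3"].apply(list))

print(row_labels)
print(values)
```

The first dict holds row positions such as [1, 2, 3] for key 1, while the second holds the Column3 values [2, 3, 5] for the same key.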

Answer 2

You could groupby on Column1, then take Column3, apply(list), and call to_dict?

In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

Or, do

In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
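
The comprehension works because iterating over a grouped column yields (key, Series) pairs. A rough sketch of the same thing written as an explicit loop, assuming a hand-built DataFrame in place of the Excel file (the name result is only for illustration):

```python
import pandas as pd

# Hypothetical in-memory copy of the question's sheet (Column1 and Column3 only).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

result = {}
for key, group in df.groupby("Column1")["Column3"]:
    # key is one Column1 value; group is a Series holding
    # the Column3 values of the rows that share that key.
    result[key] = group.tolist()

print(result)
```

This should give the same dictionary as the one-liners above; the loop form is mainly handy if you need to do something extra per group.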

python

xlrd

pandas