"No numeric types to aggregate" 同时使用 Pandas expanding()

Question

In Pandas 1.1.4, I get "DataError: No numeric types to aggregate" when using ExpandingGroupby.

Example dataset:

tmp = pd.DataFrame({'col1':['a','b','b','c','d','d'], 'col2': ['red','red','green','green','red','blue']})

print(tmp)

col1    col2
a       red
b       red
b       green
c       green
d       red
d       blue

This works:

tmp.groupby('col1').agg(lambda x: ','.join(x))

This also works:

tmp.groupby('col1').expanding().agg('count')

But this returns an error:

tmp.groupby('col1').expanding().agg(lambda x: ','.join(x))

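DataError: No numeric types to aggregate

There is no conceptual reason this shouldn't work, and there are several references online to people using custom functions with ExpandingGroupby.

There is clearly no reason the data has to be numeric, especially given that count works on non-numeric columns. What is happening here? If this can't be done natively for some reason, how can I do it manually?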

Answer 1

If you want to concatenate the values of the previous rows onto the next row within each group, perhaps you can use cumsum, appending to the string as you go:

tmp['expanding_join'] = tmp.groupby('col1')['col2'].apply(lambda x: (x + ',').cumsum()).str.rstrip(',')

Output:

  col1   col2 expanding_join
0    a    red            red
1    b    red            red
2    b  green      red,green
3    c  green          green
4    d    red            red
5    d   blue       red,blue
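The idea of carrying a running string within each group can also be sketched without pandas, which may help when checking what the result should look like. This is only an illustration under stated assumptions: `rows` is a hypothetical stand-in for the example DataFrame and `expanding_join` is a helper name made up here, not a pandas function.

```python
# Hypothetical stand-in for the example DataFrame: (col1, col2) pairs.
rows = [('a', 'red'), ('b', 'red'), ('b', 'green'),
        ('c', 'green'), ('d', 'red'), ('d', 'blue')]

def expanding_join(rows, sep=','):
    """For each row, join the col2 values seen so far in its col1 group."""
    running = {}  # group key -> joined string built so far
    out = []
    for key, value in rows:
        if key in running:
            running[key] = running[key] + sep + value
        else:
            running[key] = value
        out.append(running[key])
    return out

result = expanding_join(rows)
print(result)
# ['red', 'red', 'red,green', 'green', 'red', 'red,blue']
```

The printed list matches the joined column shown in the output above, row for row.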

Answer 2

You can use accumulate from the itertools module (df here is the example DataFrame from the question, where it is named tmp):

from itertools import accumulate

concat = lambda *args: ','.join(args)
expand = lambda x: list(accumulate(x, func=concat))

df['col3'] = df.groupby('col1')['col2'].transform(expand)
print(df)

# Output
  col1   col2       col3
0    a    red        red
1    b    red        red
2    b  green  red,green
3    c  green      green
4    d    red        red
5    d   blue   red,blue
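To see why this works, it may help to look at accumulate on a plain list: it calls the function with two arguments, the running result and the next item, and yields every intermediate result. The sketch below uses a made-up helper name, `join_pair`, and a made-up list, `colors`, purely for illustration.

```python
from itertools import accumulate

def join_pair(running, nxt):
    # accumulate passes the result so far and the next element.
    return running + ',' + nxt

colors = ['red', 'green', 'blue']
expanded = list(accumulate(colors, join_pair))
print(expanded)
# ['red', 'red,green', 'red,green,blue']
```

The first element is yielded unchanged, which is why single-row groups such as a and c keep their original value.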

Update

One-line version:

df['col3'] = df.groupby('col1')['col2'].transform(lambda x: list(accumulate(x, func=lambda *args: ','.join(args))))

Answer 3

A third option I found:

tmp['col3'] = tmp.groupby('col1')['col2'].transform(lambda x: [';'.join(x[:i+1]) for i in range(len(x))])

I'm posting it in case it is useful to anyone; however, both options from @enke and @Corralien are better.

Tested on a large dataset, the timings were:

accumulate: 0:13
apply: 0:25
for loop:  2:28

Since the accumulate option is faster and more intuitive, I marked it as the accepted answer, although the other one is also very good, as it is a one-liner with no imports.

python

dataframe

pandas

pandas-groupby