Series dictionary with Cumulative result of other columns

Question

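I have the following DataFrame, and I am trying to create a new column 'C' that holds, as a dictionary, the cumulative values of columns 'A' and 'B'. In addition, if column 'B' contains '0', the entry for that key should be removed from 'C'.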

from pandas import DataFrame

df = DataFrame({'A': [1, 2, 3, 2, 3, 2],
                'B': ['Hi', 'Hello', 'HiWorld', 'HelloWorld', '0', '0']})

for indx,row in df.iterrows():
    df['C'].append(dict(zip([row['A'],row['B']])))

I am looking for the following output in column C:

   A              B             C
0  1             Hi            {1:Hi}
1  2          Hello            {1:Hi,2:Hello}
2  3        HiWorld            {1:Hi,2:Hello,3:HiWorld}
3  2     HelloWorld            {1:Hi,2:HelloWorld,3:HiWorld}
4  3              0            {1:Hi,2:HelloWorld}
5  2              0            {1:Hi}

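I have tried potential solutions using cumsum, concat, and series.shift(1), but hit a roadblock. I then came across using dict and zip, which looks like a clean solution but does not work for me. Any suggestions?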

Answer 1

Try this:

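d = dict()       # running mapping of A -> latest B
column = list()  # one snapshot of d per row
# itertuples() yields (index, A, B); this unpacking assumes df has only columns A and B
for _, a, b in df.itertuples():
    if b != '0':
        d[a] = b
    else:
        d.pop(a, None)  # '0' removes the key; None avoids a KeyError if it is absent
    column.append(d.copy())  # copy so later updates do not alter earlier rows

df['C'] = column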
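
For reference, here is a self-contained sketch of the same idea that can be run end to end. The helper name `cumulative_dicts` and its `delete_marker` parameter are made up for illustration; they are not part of pandas. As a side note, the attempt in the question fails for two reasons: `df['C']` does not exist yet, and `zip` is given a single list, so it yields 1-tuples that `dict` cannot read as key/value pairs.

```python
import pandas as pd


def cumulative_dicts(keys, values, delete_marker="0"):
    """Return one dict snapshot per row (hypothetical helper).

    Each key maps to its most recent value; a key is dropped when
    its value equals delete_marker.
    """
    running = {}
    snapshots = []
    for key, value in zip(keys, values):
        if value == delete_marker:
            running.pop(key, None)  # no error if the key was never set
        else:
            running[key] = value
        snapshots.append(dict(running))  # copy, so later rows do not mutate it
    return snapshots


df = pd.DataFrame({'A': [1, 2, 3, 2, 3, 2],
                   'B': ['Hi', 'Hello', 'HiWorld', 'HelloWorld', '0', '0']})
df['C'] = cumulative_dicts(df['A'], df['B'])
print(df)
```

Passing the two columns explicitly, rather than unpacking `itertuples()`, keeps the loop working even if the DataFrame has other columns. It is still a plain Python loop, so it is probably best suited to small or medium frames.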

python

series

pandas