Python datatable: sum, groupby, column < 0

Hi, I'm trying to translate some R code into Python.

Here is my R code:

  df_sum <- df[, .(
    Inflow = sum(subset(Amount, Amount>0)),
    Outflow = sum(subset(Amount, Amount<0)),
    Net = sum(Amount)
  ), by = Account]

Here is my Python code so far:

df_sub = df[:, {'Inflow': dt.sum(dt.f.Amount),
                'Outflow': dt.sum(dt.f.Amount),
                'Net': dt.sum(dt.f.Amount)},
         dt.by('Account')]

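I don't know how to apply the subsets (Amount > 0 and Amount < 0) to the Inflow and Outflow columns. Can anyone help?

This is the desired output (generated with the R code):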

     Account Inflow Outflow  Net
1: Account 1    151     -32  119
2: Account 2     51    -226 -175

Sample data:

{'Account': ['Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 2', 'Account 2', 'Account 2', 'Account 2', 'Account 2', 'Account 2', 'Account 2', 'Account 2', 'Account 2'], 'Amount': [34, 23, -23, -4, 34, 4, -3, 56, -2, 3, 5, 43, -67, -3, -78, -7, -4, -67]}

Use the ifelse function to replicate your R code:

from datatable import dt, f, by, ifelse

# Rows that fail the condition become NA (None), which sum() skips.
df[:, {"Inflow": dt.sum(ifelse(f.Amount > 0, f.Amount, None)),
       "Outflow": dt.sum(ifelse(f.Amount < 0, f.Amount, None)),
       "Net": dt.sum(f.Amount)},
   by("Account")]

      Account  Inflow  Outflow   Net
0   Account 1     151      -32   119
1   Account 2      51     -226  -175
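
If you want to sanity-check those figures without datatable, the same conditional sums can be sketched in plain Python. This is only an illustrative cross-check using the sample data from the question; the helper name summarize is made up for this example and is not part of any library.

```python
from collections import defaultdict

# Sample data from the question (9 rows per account).
data = {
    "Account": ["Account 1"] * 9 + ["Account 2"] * 9,
    "Amount": [34, 23, -23, -4, 34, 4, -3, 56, -2,
               3, 5, 43, -67, -3, -78, -7, -4, -67],
}

def summarize(accounts, amounts):
    """Hypothetical helper: per-account Inflow, Outflow and Net totals."""
    totals = defaultdict(lambda: {"Inflow": 0, "Outflow": 0, "Net": 0})
    for account, amount in zip(accounts, amounts):
        row = totals[account]
        if amount > 0:
            row["Inflow"] += amount    # only positive amounts
        elif amount < 0:
            row["Outflow"] += amount   # only negative amounts
        row["Net"] += amount           # every amount
    return dict(totals)

summary = summarize(data["Account"], data["Amount"])
for account, row in summary.items():
    print(account, row["Inflow"], row["Outflow"], row["Net"])
```

The if/elif branches play the same role as the two ifelse conditions above: each amount contributes to Inflow or Outflow depending on its sign, and always to Net.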