Pivot specific rows to columns based on certain conditions in pandas

Question

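This is the dataframe I am working with. Multiple customers can be associated with a case ID in different months (case_ID, cust_val, and date are columns in this table).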

case_ID| cust_val | date | primary | action | change   |
    1  | xx       | 3/2  |   1     |        | increase |
    1  | xx       | 3/2  |         |    1   | decrease |
    1  | xx       | 3/1  |   1     |        | decrease |
    1  | xx       | 3/1  |         |   1    | decrease |
    1  | yy       | 3/2  |   1     |        | decrease |
    1  | yy       | 3/2  |         |   1    | increase |
    2  | yy       | 3/2  |         |   1    | increase |
    2  | yy       | 3/2  |     1   |        | increase |

I would like the output table to look like this, where for each case_ID, cust_val, and date, the changes associated with primary and action are in a single row:

case_ID| cust_val | date | primary_change | action_change |
    1  | xx       | 3/2  |   increase     |   decrease    |
    1  | xx       | 3/1  |   decrease     |   decrease    | 
    1  | yy       | 3/2  |   decrease     |   increase    | 
    2  | yy       | 3/2  |   increase     |   increase    |

I tried this, but it is clearly wrong, and I am not sure how to fix it:

df.pivot(index=['case_ID','cust_val','date'], columns=['primary', 'action'], values='change').reset_index()

Any help is appreciated. Thanks in advance.

Answer 1

You can filter the dataframe and merge:

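import pandas as pd

# df is the DataFrame from the question.
# Split it into the rows flagged as primary and the rows flagged as action.
a = df[df.primary == "1"]  # <-- change "1" to 1 if the values are integers
b = df[df.action == "1"]

# Join the two halves on the key columns; merge suffixes the clashing
# columns with _x (from a) and _y (from b).
x = (
    pd.merge(a, b, on=["case_ID", "cust_val", "date"])
    .rename(columns={"change_x": "primary_change", "change_y": "action_change"})
    .drop(columns=["primary_x", "action_x", "primary_y", "action_y"])
)
print(x)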

Prints:

   case_ID cust_val date primary_change action_change
0        1       xx  3/2       increase      decrease
1        1       xx  3/1       decrease      decrease
2        1       yy  3/2       decrease      increase
3        2       yy  3/2       increase      increase
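
For comparison, here is a minimal sketch of a pivot-based variant, which is not part of the answer above. Rather than pivoting on the two flag columns directly, it first collapses them into a single label column and pivots on that. It assumes the blank cells are NaN, the flags are numeric, and each key combination has exactly one primary row and one action row. The helper column name `kind` is made up for illustration, and passing a list as `index` needs pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; blank cells are assumed to be NaN.
df = pd.DataFrame(
    {
        "case_ID": [1, 1, 1, 1, 1, 1, 2, 2],
        "cust_val": ["xx", "xx", "xx", "xx", "yy", "yy", "yy", "yy"],
        "date": ["3/2", "3/2", "3/1", "3/1", "3/2", "3/2", "3/2", "3/2"],
        "primary": [1, np.nan, 1, np.nan, 1, np.nan, np.nan, 1],
        "action": [np.nan, 1, np.nan, 1, np.nan, 1, 1, np.nan],
        "change": ["increase", "decrease", "decrease", "decrease",
                   "decrease", "increase", "increase", "increase"],
    }
)

keys = ["case_ID", "cust_val", "date"]

# Label each row by which flag is set ("kind" is a hypothetical helper column).
df["kind"] = np.where(df["primary"].eq(1), "primary_change", "action_change")

# Turn the labels into columns, then restore the key columns and column order.
result = (
    df.pivot(index=keys, columns="kind", values="change")
    .reset_index()
    .rename_axis(columns=None)[keys + ["primary_change", "action_change"]]
)
print(result)
```

Unlike the merge version, `pivot` sorts the result by the key columns, so the row order can differ from the table in the question. It also raises an error if a key combination has two rows with the same label, which may be useful as a sanity check on the data.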

Tags: python, pivot, dataframe, pandas