Pivoting a Pandas DataFrame containing duplicate column names with different values

Question

I have a question about pivot_table in Python pandas.

I have a DataFrame like this:

Agent   Detail                  Value
report1 General Section         YESS
report1 jobID                   558
report1 Priority                normal
report1 Run As                  Owner's Credentials
report1 Schedule Section    
report1 disabled                TRUE
report1 timeZoneId              None
report1 startImmediately        FALSE
report1 repeatMinuteInterval    None
report1 start date              None
report1 start time              None
report1 Email Recipient         abc@xyz.com
report1 Email Recipient         xyz@sbc.com
report2 General Section         YESS
report2 jobID                   559
report2 Priority                normal
report2 Run As                  Owner's Credentials
report2 Schedule Section    
report2 disabled                TRUE
report2 timeZoneId              None
report2 startImmediately        FALSE
report2 repeatMinuteInterval    None
report2 start date              None
report2 start time              None
report2 Email Recipient         abc123@xyz.com
report2 Email Recipient         xyz11123@sbc.com

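I am trying to pivot the DataFrame so that every Detail value becomes a column. The index is the Agent field, which holds the report name. Each report can have multiple recipients, and I need one row per recipient for each report. Sample output:

[image: sample output]

My current code is as follows: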

import pandas as pd

# Raw string, so that "\t" in the path is not interpreted as a tab character
resultsFile = r'C:\Oracle\testfile.csv'    # input file to transpose
df = pd.read_csv(resultsFile, skip_blank_lines=True)
df2 = df.pivot_table(index='Agent', columns='Detail', values='Value', aggfunc='sum')
df2

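This concatenates the email addresses into a single field, which is not what I am looking for. How can I pivot a df with duplicate column values and turn them into multiple rows?

Thanks for your help.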
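
A small reproduction of the behaviour described above, on a few made-up rows shaped like the data in the question (the shortened frame and the explicit dtype=object are assumptions for illustration, not part of the original code). With aggfunc='sum', all values that share the same Agent and Detail are added together, and adding strings concatenates them:

```python
import pandas as pd

# Hypothetical, shortened version of the question's data.
# dtype=object keeps the values as plain Python strings.
df = pd.DataFrame(
    {
        'Agent': ['report1', 'report1', 'report1'],
        'Detail': ['jobID', 'Email Recipient', 'Email Recipient'],
        'Value': ['558', 'abc@xyz.com', 'xyz@sbc.com'],
    },
    dtype=object,
)

wide = df.pivot_table(index='Agent', columns='Detail', values='Value', aggfunc='sum')

# Both recipients of report1 end up glued together in one cell.
merged = wide.loc['report1', 'Email Recipient']
print(merged)
```

Any aggregation function collapses the duplicates into one cell, so getting one row per recipient needs a different reshaping step rather than a different aggfunc.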

Answer 1

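You can group your df by Agent and pivot each group, using the original row index as the pivot index. Because the pivot then produces one row per value, with NaN everywhere else, you have to fill the NaN values and drop the duplicate rows: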

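# df is the DataFrame from the question
reports = []
for a, sub_df in df.groupby('Agent'):
    # No index given: pivot on the original row labels, one column per Detail.
    # Keyword arguments are used because pandas 2.0+ no longer accepts positional ones here.
    rep = sub_df.pivot(columns='Detail', values='Value').ffill().bfill().drop_duplicates()
    rep.insert(0, 'Agent', a)
    reports.append(rep)

# drop=True discards the original row labels instead of adding them as an "index" column
result = pd.concat(reports).reset_index(drop=True)
print(result)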

Output:

Detail    Agent   Email Recipient General Section Priority               Run As  ...  repeatMinuteInterval start date start time startImmediately timeZoneId
0       report1       abc@xyz.com            YESS   normal  Owner's Credentials  ...                  None       None       None            FALSE       None
1       report1       xyz@sbc.com            YESS   normal  Owner's Credentials  ...                  None       None       None            FALSE       None
2       report2    abc123@xyz.com            YESS   normal  Owner's Credentials  ...                  None       None       None            FALSE       None
3       report2  xyz11123@sbc.com            YESS   normal  Owner's Credentials  ...                  None       None       None            FALSE       None
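
To make the fill and de-duplication steps easier to follow, here is a sketch of what happens to a single group. The three-row sub_df is a made-up stand-in for one Agent's rows, and the intermediate names sparse and filled are introduced only for illustration:

```python
import pandas as pd

# Hypothetical rows for a single Agent (the Agent column itself is not needed here).
sub_df = pd.DataFrame({
    'Detail': ['jobID', 'Email Recipient', 'Email Recipient'],
    'Value': ['558', 'abc@xyz.com', 'xyz@sbc.com'],
})

# Step 1: pivot on the original row labels. Each row keeps exactly one
# non-NaN cell, in the column named by its Detail.
sparse = sub_df.pivot(columns='Detail', values='Value')

# Step 2: ffill copies each value down, bfill copies it up into the
# rows that came before it, so single-valued columns become constant.
filled = sparse.ffill().bfill()

# Step 3: rows that are now identical collapse into one, leaving one
# row per distinct value of the repeated Detail.
rep = filled.drop_duplicates()
print(rep)
```

A column whose values are all missing stays NaN throughout. This sketch covers one repeated Detail per group; with several repeated Details in the same group the fill order decides which values get paired, so the result should be checked in that case.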

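A possible alternative without an explicit loop, offered as a sketch rather than as part of the answer above: number the repeated (Agent, Detail) pairs with cumcount, pivot on Agent plus that counter, then forward-fill inside each Agent. The helper column name n and the shortened sample data are assumptions made for this example:

```python
import pandas as pd

# Hypothetical, shortened version of the question's data.
df = pd.DataFrame({
    'Agent': ['report1'] * 4 + ['report2'] * 4,
    'Detail': ['jobID', 'Priority', 'Email Recipient', 'Email Recipient'] * 2,
    'Value': ['558', 'normal', 'abc@xyz.com', 'xyz@sbc.com',
              '559', 'normal', 'abc123@xyz.com', 'xyz11123@sbc.com'],
})

# n = 0 for the first occurrence of a Detail within an Agent, 1 for the second, ...
df['n'] = df.groupby(['Agent', 'Detail']).cumcount()

# (Agent, n) is now unique per Detail, so a plain pivot works.
wide = df.pivot(index=['Agent', 'n'], columns='Detail', values='Value')

# Rows with n > 0 only hold the repeated Detail; copy the single-valued
# Details down from the n = 0 row of the same Agent.
wide = wide.groupby(level='Agent').ffill()

result = wide.reset_index().drop(columns='n')
print(result)
```

Passing a list as the pivot index needs a reasonably recent pandas release. Unlike the loop version, this one never back-fills, so it relies on every single-valued Detail being present in the n = 0 row, which holds whenever each Detail appears at least once per Agent.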