Reshaping a Pandas DataFrame for Mail Merge in Word
I have a DataFrame that looks like this:
Customer ID Invoice ID Invoice Total Customer Total
8063863 110100456 41,47 248,82
8063863 110100677 41,47 248,82
8063863 110100838 41,47 248,82
8063863 110101106 41,47 248,82
8063863 110101259 41,47 248,82
8063863 110101401 41,47 248,82
What I want is this:
Customer ID Invoice_1 Invoice_Total_1 Invoice_2 Invoice_Total_2 ... Customer_Total
8063863 110100456 41,47 110100677 41,47 248,82
I then want to export the DataFrame to CSV and use it in Word to mail merge a summary for each customer.
I added Customer Total in Pandas using pivot_table, but I am stuck on flattening the DataFrame.
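Let's try this:
import pandas as pd

def f(x):
    # n: 0-based position of each invoice, i: the unique invoice IDs (assumes IDs are unique per customer)
    n, i = pd.factorize(x['Invoice ID'])
    # One-row frames holding the invoice totals and the invoice IDs, with columns numbered from 1
    df1 = pd.DataFrame([x.loc[(x['Invoice ID']==i.values),'Invoice Total'].values], columns=(n+1).astype(str)).add_prefix('Invoice_Total_')
    df2 = pd.DataFrame([i.values], columns=(n+1).astype(str)).add_prefix('Invoice_')
    return pd.concat([df1, df2], axis=1).assign(Customer_Total=x['Customer Total'].max())

df_out = df.groupby('Customer ID').apply(f).reset_index(-1, drop=True)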
Output:
Invoice_Total_1 Invoice_Total_2 Invoice_Total_3 Invoice_Total_4 \
Customer ID
8063863 41,47 41,47 41,47 41,47
Invoice_Total_5 Invoice_Total_6 Invoice_1 Invoice_2 Invoice_3 \
Customer ID
8063863 41,47 41,47 110100456 110100677 110100838
Invoice_4 Invoice_5 Invoice_6 Customer_Total
Customer ID
8063863 110101106 110101259 110101401 248,82
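The output above puts all the Invoice_Total_ columns before the Invoice_ columns, while the question asks for each invoice next to its total. As an alternative (not the method used above), here is a minimal sketch that numbers the invoices with cumcount, unstacks them, and then orders the columns explicitly. The sample frame is rebuilt from the question's data, the amounts are assumed to be strings with decimal commas, and the semicolon separator is only an assumption that tends to suit locales with decimal commas.

```python
import pandas as pd

# Sample data rebuilt from the question; amounts are assumed to be strings.
df = pd.DataFrame({
    'Customer ID': [8063863] * 6,
    'Invoice ID': [110100456, 110100677, 110100838,
                   110101106, 110101259, 110101401],
    'Invoice Total': ['41,47'] * 6,
    'Customer Total': ['248,82'] * 6,
})

# Number each customer's invoices 1, 2, 3, ... in their existing row order.
numbered = df.assign(n=df.groupby('Customer ID').cumcount() + 1)

# One row per customer; columns become (field, n) pairs.
wide = (numbered
        .set_index(['Customer ID', 'n'])[['Invoice ID', 'Invoice Total']]
        .unstack('n'))

# Flatten the (field, n) pairs into names such as Invoice_1, Invoice_Total_1.
short = {'Invoice ID': 'Invoice', 'Invoice Total': 'Invoice_Total'}
wide.columns = [f"{short[field]}_{k}" for field, k in wide.columns]

# Interleave the columns: Invoice_1, Invoice_Total_1, Invoice_2, ...
n_max = int(numbered['n'].max())
ordered = [f"{name}_{k}"
           for k in range(1, n_max + 1)
           for name in ('Invoice', 'Invoice_Total')]
wide = wide[ordered]

# Customer Total is the same on every row of a customer, so take the first one.
wide['Customer_Total'] = df.groupby('Customer ID')['Customer Total'].first()

# Build the CSV text; pass a file path as the first argument to write a file.
csv_text = wide.to_csv(sep=';')
```

Customers with fewer invoices than the maximum get empty cells in the unused columns, which Word should show as blank merge fields.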