Pandas pivot table + keep 2 additional columns
I am trying to pivot this df:
import pandas as pd

data = {
'account': ['Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 2'],
'product': ['Product 1', 'Product 1', 'Product 1', 'Product 2', 'Product 3', 'Product 1'],
'metric': ['Meric 1', 'Meric 1', 'Meric 2', 'Meric 1', 'Meric 1', 'Meric 1'],
'date': ['Date 1', 'Date 2', 'Date 3', 'Date 4', 'Date 5', 'Date 1'],
'value': [1, 2, 3, 4, 5, 6]
}
# The frame is called `new` in the pivot_table calls below
new = pd.DataFrame(data)
print(new)
account product metric date value
0 Account 1 Product 1 Meric 1 Date 1 1
1 Account 1 Product 1 Meric 1 Date 2 2
2 Account 1 Product 1 Meric 2 Date 3 3
3 Account 1 Product 2 Meric 1 Date 4 4
4 Account 1 Product 3 Meric 1 Date 5 5
5 Account 2 Product 1 Meric 1 Date 1 6
into a view like this, but with the date and product columns carried over as they are:
new.pivot_table(index='account', columns='metric', values='value')
What I currently get:
metric Meric 1 Meric 2
account
Account 1 3.0 3.0
Account 2 6.0 NaN
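The 3.0 for Account 1 / Meric 1 is not a single source value: pivot_table aggregates with the mean by default, so the four matching rows (1, 2, 4, 5) collapse into their average. A minimal sketch that makes this visible, assuming the same data as above:

```python
import pandas as pd

data = {
    'account': ['Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 2'],
    'product': ['Product 1', 'Product 1', 'Product 1', 'Product 2', 'Product 3', 'Product 1'],
    'metric': ['Meric 1', 'Meric 1', 'Meric 2', 'Meric 1', 'Meric 1', 'Meric 1'],
    'date': ['Date 1', 'Date 2', 'Date 3', 'Date 4', 'Date 5', 'Date 1'],
    'value': [1, 2, 3, 4, 5, 6],
}
new = pd.DataFrame(data)

# Default call: aggfunc is left at its default, which is the mean
default = new.pivot_table(index='account', columns='metric', values='value')

# Same call with the aggregation spelled out
explicit = new.pivot_table(index='account', columns='metric', values='value', aggfunc='mean')

# The rows that were averaged into the Account 1 / Meric 1 cell
collapsed = new[(new['account'] == 'Account 1') & (new['metric'] == 'Meric 1')]
print(collapsed['value'].tolist())
print(default)
```

With only account in the index, every row sharing an account and metric lands in one cell, which is why product and date cannot survive this form of the pivot.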
What I am looking for:
metric Meric 1 Meric 2 product date
account
Account 1 1.0 NaN Product 1 Date 1
Account 1 2.0 NaN Product 1 Date 2
Account 1 NaN 3.0 Product 1 Date 3
...
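The only catch is that the account will be repeated, but that is exactly what I want, for cases where the same product appears on different dates.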
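Add the two columns to the index parameter of pivot_table, then convert the second and third index levels back to columns with reset_index and change the column order: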
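# Put product and date into the index so each (account, product, date) keeps its own row
df = (new.pivot_table(index=['account', 'product', 'date'], columns='metric', values='value')
         .reset_index(level=[1, 2]))
# reset_index puts product and date first; move them after the metric columns
df = df[df.columns[2:].tolist() + df.columns[:2].tolist()]
print(df)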
metric Meric 1 Meric 2 product date
account
Account 1 1.0 NaN Product 1 Date 1
Account 1 2.0 NaN Product 1 Date 2
Account 1 NaN 3.0 Product 1 Date 3
Account 1 4.0 NaN Product 2 Date 4
Account 1 5.0 NaN Product 3 Date 5
Account 2 6.0 NaN Product 1 Date 1
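The positional slicing above relies on exactly two columns being restored from the index. A variant that selects them by name instead may be easier to adapt; this is only a sketch of the same approach, and the names keep_cols, metric_cols and out are made up for illustration:

```python
import pandas as pd

data = {
    'account': ['Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 1', 'Account 2'],
    'product': ['Product 1', 'Product 1', 'Product 1', 'Product 2', 'Product 3', 'Product 1'],
    'metric': ['Meric 1', 'Meric 1', 'Meric 2', 'Meric 1', 'Meric 1', 'Meric 1'],
    'date': ['Date 1', 'Date 2', 'Date 3', 'Date 4', 'Date 5', 'Date 1'],
    'value': [1, 2, 3, 4, 5, 6],
}
new = pd.DataFrame(data)

# Columns to carry through the pivot unchanged (hypothetical helper name)
keep_cols = ['product', 'date']

out = (new.pivot_table(index=['account'] + keep_cols, columns='metric', values='value')
          .reset_index(level=keep_cols))  # move the kept levels back to columns by name

# Metric columns first, kept columns last, without counting positions
metric_cols = [c for c in out.columns if c not in keep_cols]
out = out[metric_cols + keep_cols]
print(out)
```

Note that pivot_table still averages by default, so if the same account, product, date and metric combination occurred more than once, those rows would be merged into their mean rather than kept separately.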