分组依据 - Python / Pandas 中的总时数和按类别分类的时数
Group By - Total Hours and Hours by Category in Python / Pandas
我需要使用 Python / Pandas [按状态 计算每周 总小时数 和 小时数 分组依据.
Id Week Status Hours
1 01/10/2022 - 01/16/2022 On 5
2 01/10/2022 - 01/16/2022 Off 2
3 01/17/2022 - 01/23/2022 Off 6
4 01/17/2022 - 01/23/2022 On 1
5 01/17/2022 - 01/23/2022 On 5
6 01/03/2022 - 01/09/2022 On 10
7 01/10/2022 - 01/16/2022 Off 9
8 01/03/2022 - 01/09/2022 On 3
9 01/24/2022 - 01/30/2022 Off 4
10 01/24/2022 - 01/30/2022 On 7
test_data = {'Id': [1,2,3,4,5,6,7,8,9,10],
'Week': ['01/10/2022 - 01/16/2022', '01/10/2022 - 01/16/2022', '01/17/2022 - 01/23/2022', '01/17/2022 - 01/23/2022', '01/17/2022 - 01/23/2022', '01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', '01/03/2022 - 01/09/2022', '01/24/2022 - 01/30/2022', '01/24/2022 - 01/30/2022'],
'Status': ['On', 'Off', 'Off', 'On', 'On', 'On', 'Off', 'On', 'Off', 'On'],
'Hours': [5,2,6,1,5,10,9,3,4,7]}
test_df = pd.DataFrame(data=test_data)
我每周可以获得总小时数:
test_df.groupby(by=['Week'], as_index=False).agg({"Hours": "sum"})
但我不知道如何也按状态分组,所以它将是 2 个额外的列(On Status Hours and Off Status Hours)
如果我只将 Status 列添加到 groupby 部分,它会创建额外的行(我明白为什么)
test_df.groupby(by=['Week', 'Status'], as_index=False).agg({"Hours": "sum"})
我想要的输出:
Week
Total Hours
On Status Hours
Off Status Hours
01/03/2022 - 01/09/2022
13
13
0
01/10/2022 - 01/16/2022
16
5
11
01/17/2022 - 01/23/2022
12
6
6
01/24/2022 - 01/30/2022
11
7
4
您可以使用 pd.pivot_table
得到您的结果:
x = pd.pivot_table(
test_df,
index="Week",
columns="Status",
values="Hours",
aggfunc="sum",
fill_value=0,
).add_suffix(" Status Hours")
x["Total Hours"] = x.sum(axis=1)
print(x)
打印:
Status Off Status Hours On Status Hours Total Hours
Week
01/03/2022 - 01/09/2022 0 13 13
01/10/2022 - 01/16/2022 11 5 16
01/17/2022 - 01/23/2022 6 6 12
01/24/2022 - 01/30/2022 4 7 11
您可以使用:
(test_df
.groupby(['Week', 'Status'])['Hours']
.sum()
.unstack(1, fill_value=0)
.add_suffix(' Status Hours')
.assign(**{'Total Hours': lambda d: d.sum(1)})
)
输出:
Status Off Status Hours On Status Hours Total Hours
Week
01/03/2022 - 01/09/2022 0 13 13
01/10/2022 - 01/16/2022 11 5 16
01/17/2022 - 01/23/2022 6 6 12
01/24/2022 - 01/30/2022 4 7 11
我需要使用 Python / Pandas [按状态 计算每周 总小时数 和 小时数 分组依据.
Id Week Status Hours
1 01/10/2022 - 01/16/2022 On 5
2 01/10/2022 - 01/16/2022 Off 2
3 01/17/2022 - 01/23/2022 Off 6
4 01/17/2022 - 01/23/2022 On 1
5 01/17/2022 - 01/23/2022 On 5
6 01/03/2022 - 01/09/2022 On 10
7 01/10/2022 - 01/16/2022 Off 9
8 01/03/2022 - 01/09/2022 On 3
9 01/24/2022 - 01/30/2022 Off 4
10 01/24/2022 - 01/30/2022 On 7
test_data = {'Id': [1,2,3,4,5,6,7,8,9,10],
'Week': ['01/10/2022 - 01/16/2022', '01/10/2022 - 01/16/2022', '01/17/2022 - 01/23/2022', '01/17/2022 - 01/23/2022', '01/17/2022 - 01/23/2022', '01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', '01/03/2022 - 01/09/2022', '01/24/2022 - 01/30/2022', '01/24/2022 - 01/30/2022'],
'Status': ['On', 'Off', 'Off', 'On', 'On', 'On', 'Off', 'On', 'Off', 'On'],
'Hours': [5,2,6,1,5,10,9,3,4,7]}
test_df = pd.DataFrame(data=test_data)
我每周可以获得总小时数:
test_df.groupby(by=['Week'], as_index=False).agg({"Hours": "sum"})
但我不知道如何也按状态分组,所以它将是 2 个额外的列(On Status Hours and Off Status Hours)
如果我只将 Status 列添加到 groupby 部分,它会创建额外的行(我明白为什么)
test_df.groupby(by=['Week', 'Status'], as_index=False).agg({"Hours": "sum"})
我想要的输出:
Week | Total Hours | On Status Hours | Off Status Hours |
---|---|---|---|
01/03/2022 - 01/09/2022 | 13 | 13 | 0 |
01/10/2022 - 01/16/2022 | 16 | 5 | 11 |
01/17/2022 - 01/23/2022 | 12 | 6 | 6 |
01/24/2022 - 01/30/2022 | 11 | 7 | 4 |
您可以使用 pd.pivot_table
得到您的结果:
x = pd.pivot_table(
test_df,
index="Week",
columns="Status",
values="Hours",
aggfunc="sum",
fill_value=0,
).add_suffix(" Status Hours")
x["Total Hours"] = x.sum(axis=1)
print(x)
打印:
Status Off Status Hours On Status Hours Total Hours
Week
01/03/2022 - 01/09/2022 0 13 13
01/10/2022 - 01/16/2022 11 5 16
01/17/2022 - 01/23/2022 6 6 12
01/24/2022 - 01/30/2022 4 7 11
您可以使用:
(test_df
.groupby(['Week', 'Status'])['Hours']
.sum()
.unstack(1, fill_value=0)
.add_suffix(' Status Hours')
.assign(**{'Total Hours': lambda d: d.sum(1)})
)
输出:
Status Off Status Hours On Status Hours Total Hours
Week
01/03/2022 - 01/09/2022 0 13 13
01/10/2022 - 01/16/2022 11 5 16
01/17/2022 - 01/23/2022 6 6 12
01/24/2022 - 01/30/2022 4 7 11