Grouping and summing values for every 5 minutes / resampling data to 5-minute bins with string values
I want to sum the values for each gender within every 5-minute timestamp.
Main table:
Time Gender value
10:01 Male 5
10:02 Female 1
10:03 Male 5
10:04 Male 5
10:05 Female 1
10:06 Female 1
10:07 Male 5
10:08 Male 5
10:09 Male 5
10:10 Male 5
Required result:
Time Gender value
10:00 Male 15
10:00 Female 2
10:05 Male 20
10:05 Female 1
You can convert the Time column to a Timedelta, floor it to 5 minutes, and use the result as a key for groupby + agg:
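import pandas as pd

# append ':00' so the 'HH:MM' strings parse as 'HH:MM:SS' Timedeltas
t = pd.to_timedelta(df['Time']+':00')
# group by the 5-minute floor of the time and by Gender, then sum
(df
   .groupby([t.dt.floor('5min'), 'Gender'])
   .agg({'value': 'sum'})
   .reset_index()
)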
Output:
Time Gender value
0 0 days 10:00:00 Female 1
1 0 days 10:00:00 Male 15
2 0 days 10:05:00 Female 2
3 0 days 10:05:00 Male 15
4 0 days 10:10:00 Male 5
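For anyone who wants to run the snippet above as is, here is a minimal self-contained sketch. The DataFrame is simply the main table from the question typed in by hand; the variable name `result` is only there for illustration.

```python
import pandas as pd

# Sample data copied from the question's main table.
df = pd.DataFrame({
    "Time": ["10:01", "10:02", "10:03", "10:04", "10:05",
             "10:06", "10:07", "10:08", "10:09", "10:10"],
    "Gender": ["Male", "Female", "Male", "Male", "Female",
               "Female", "Male", "Male", "Male", "Male"],
    "value": [5, 1, 5, 5, 1, 1, 5, 5, 5, 5],
})

# "HH:MM" strings get a seconds part so they parse as Timedeltas.
t = pd.to_timedelta(df["Time"] + ":00")

# Floor each time to its 5-minute bin and sum per (bin, Gender).
result = (
    df.groupby([t.dt.floor("5min"), "Gender"])
      .agg({"value": "sum"})
      .reset_index()
)
print(result)
```

Note that a plain floor puts 10:05 into the 10:05 bin and 10:10 into a bin of its own, which is why this gives five rows rather than the four in the required result.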
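Matching the provided output
To match the output you provided, two more things are needed:
- subtract one minute before flooring, so that a time on a 5-minute boundary (e.g. 10:05) falls into the previous bin (10:00)
- convert the result back to a string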
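# shift every time back by one minute so the bins become (10:00, 10:05], (10:05, 10:10], ...
t = pd.to_timedelta(df['Time']+':00').sub(pd.to_timedelta('1min'))
(df
   .groupby([t.dt.floor('5min'), 'Gender'])
   .agg({'value': 'sum'})
   .reset_index()
   # add the Timedelta to the epoch timestamp so it can be formatted as 'HH:MM'
   .assign(Time=lambda d: (pd.to_datetime(0)+d['Time']).dt.strftime('%H:%M'))
)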
Output:
Time Gender value
0 10:00 Female 2
1 10:00 Male 15
2 10:05 Female 1
3 10:05 Male 20
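To see why the one-minute shift changes the grouping, it can help to look at which bin a few times land in with and without it. This is only an illustrative sketch; the column names `floor` and `shifted_floor` are made up for the comparison.

```python
import pandas as pd

# A few times around the 5-minute boundaries.
times = pd.Series(["10:04", "10:05", "10:06", "10:10"])
t = pd.to_timedelta(times + ":00")

def as_hhmm(td):
    # '0 days 10:00:00' -> '10:00'
    return td.astype(str).str[-8:-3]

bins = pd.DataFrame({
    "Time": times,
    # plain floor: bins are [10:00, 10:05), [10:05, 10:10), ...
    "floor": as_hhmm(t.dt.floor("5min")),
    # shift back one minute first: 10:05 joins the 10:00 bin, 10:10 joins 10:05
    "shifted_floor": as_hhmm(t.sub(pd.to_timedelta("1min")).dt.floor("5min")),
})
print(bins)
```

With minute-resolution data like this, the shift effectively makes each bin include its right edge instead of its left one.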
Variant (build the 'HH:MM' string key first, then group on it):
t = pd.to_timedelta(df['Time']+':00').sub(pd.to_timedelta('1min'))
# '0 days 10:00:00' -> '10:00' by slicing the string form of the floored Timedelta
(df.assign(Time=t.dt.floor('5min').astype(str).str[-8:-3])
   .groupby(['Time', 'Gender'])
   ['value'].sum().reset_index()
)
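Since the title also mentions resampling, here is a different technique from the one in the answer, offered only as a hedged sketch: parse the strings as datetimes and let `pd.Grouper` do the binning with `closed='right', label='left'`, which for this data should produce the same bins as the one-minute shift. The helper column `ts` is a hypothetical name, and the date that `pd.to_datetime` attaches is a placeholder that gets discarded.

```python
import pandas as pd

# Sample data copied from the question's main table.
df = pd.DataFrame({
    "Time": ["10:01", "10:02", "10:03", "10:04", "10:05",
             "10:06", "10:07", "10:08", "10:09", "10:10"],
    "Gender": ["Male", "Female", "Male", "Male", "Female",
               "Female", "Male", "Male", "Male", "Male"],
    "value": [5, 1, 5, 5, 1, 1, 5, 5, 5, 5],
})

# Parse "HH:MM" into timestamps; only the time of day matters here.
ts = pd.to_datetime(df["Time"], format="%H:%M")

out = (
    df.assign(ts=ts)
      # right-closed 5-minute bins, labelled by their left edge
      .groupby([pd.Grouper(key="ts", freq="5min", closed="right", label="left"),
                "Gender"])["value"]
      .sum()
      .reset_index()
)

# Convert the bin label back to an "HH:MM" string and restore the column order.
out["Time"] = out["ts"].dt.strftime("%H:%M")
out = out[["Time", "Gender", "value"]]
print(out)
```

The trade-off is an extra datetime column in exchange for not having to shift the times by hand; which reads better is largely a matter of taste.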