每个不同参与者的相应周列以及开始日期
Corresponding week column for each different participant with starting dates
我有一个示例数据框,如下所示。
import pandas as pd
data = {'ID':['A','A','A','A','A', 'B','B','B','C','C','C','C'],
'Date':['9/20/2021 4:34:57 AM', '9/21/2021 6:54:22 PM', '9/28/2021 5:54:22 PM',
'9/30/2021 6:54:22 PM', '10/5/2021 6:54:22 PM',
'9/4/2021 7:04:38 PM','9/9/2021 7:04:38 PM','9/17/2021 7:04:38 PM',
'9/28/2021 3:21:23 AM','9/30/2021 3:21:23 AM','10/05/2021 3:21:23
AM','10/15/2021 3:21:23 AM']}
df1 = pd.DataFrame(data)
df1
'Date' 列采用日期时间格式。
现在对于每个参与者,第一个条目是注册日期,这将是该特定参与者第一周的开始。我想要后续日期条目的相应周数,将第一个条目视为第 0 天。
最终的数据帧应该如下图所示。
非常感谢任何帮助。谢谢。
尝试:
df['Date'] = pd.to_datetime(df['Date'])
df['Week'] = df.groupby('ID')['Date'] \
.apply(lambda t: 1 + (t - min(t)).dt.days // 7)
df['Week'] = 'Week' + df['Week'].astype(str)
输出:
>>> df
ID Date Week
0 A 2021-09-20 04:34:57 Week1
1 A 2021-09-21 18:54:22 Week1
2 A 2021-09-28 17:54:22 Week2
3 A 2021-09-30 18:54:22 Week2
4 A 2021-10-05 18:54:22 Week3
5 B 2021-09-04 19:04:38 Week1
6 B 2021-09-09 19:04:38 Week1
7 B 2021-09-17 19:04:38 Week2
8 C 2021-09-28 03:21:23 Week1
9 C 2021-09-30 03:21:23 Week1
10 C 2021-10-05 03:21:23 Week2
11 C 2021-10-15 03:21:23 Week3
由@ddejohn修复
您可以使用:
df1['Date'] = pd.to_datetime(df1['Date'])
df1['Week'] = (df1.groupby('ID')['Date'].diff().dt.days
.groupby(df1['ID']).cumsum()
.fillna(0, downcast='infer') // 7 + 1)
df1['Week'] = 'Week' + df1['Week'].astype(str)
结果:
print(df1)
ID Date Week
0 A 2021-09-20 04:34:57 Week1
1 A 2021-09-21 18:54:22 Week1
2 A 2021-09-28 17:54:22 Week2
3 A 2021-09-30 18:54:22 Week2
4 A 2021-10-05 18:54:22 Week3
5 B 2021-09-04 19:04:38 Week1
6 B 2021-09-09 19:04:38 Week1
7 B 2021-09-17 19:04:38 Week2
8 C 2021-09-28 03:21:23 Week1
9 C 2021-09-30 03:21:23 Week1
10 C 2021-10-05 03:21:23 Week2
11 C 2021-10-15 03:21:23 Week3
我有一个示例数据框,如下所示。
import pandas as pd
data = {'ID':['A','A','A','A','A', 'B','B','B','C','C','C','C'],
'Date':['9/20/2021 4:34:57 AM', '9/21/2021 6:54:22 PM', '9/28/2021 5:54:22 PM',
'9/30/2021 6:54:22 PM', '10/5/2021 6:54:22 PM',
'9/4/2021 7:04:38 PM','9/9/2021 7:04:38 PM','9/17/2021 7:04:38 PM',
'9/28/2021 3:21:23 AM','9/30/2021 3:21:23 AM','10/05/2021 3:21:23
AM','10/15/2021 3:21:23 AM']}
df1 = pd.DataFrame(data)
df1
'Date' 列采用日期时间格式。
现在对于每个参与者,第一个条目是注册日期,这将是该特定参与者第一周的开始。我想要后续日期条目的相应周数,将第一个条目视为第 0 天。
最终的数据帧应该如下图所示。
非常感谢任何帮助。谢谢。
尝试:
df['Date'] = pd.to_datetime(df['Date'])
df['Week'] = df.groupby('ID')['Date'] \
.apply(lambda t: 1 + (t - min(t)).dt.days // 7)
df['Week'] = 'Week' + df['Week'].astype(str)
输出:
>>> df
ID Date Week
0 A 2021-09-20 04:34:57 Week1
1 A 2021-09-21 18:54:22 Week1
2 A 2021-09-28 17:54:22 Week2
3 A 2021-09-30 18:54:22 Week2
4 A 2021-10-05 18:54:22 Week3
5 B 2021-09-04 19:04:38 Week1
6 B 2021-09-09 19:04:38 Week1
7 B 2021-09-17 19:04:38 Week2
8 C 2021-09-28 03:21:23 Week1
9 C 2021-09-30 03:21:23 Week1
10 C 2021-10-05 03:21:23 Week2
11 C 2021-10-15 03:21:23 Week3
由@ddejohn修复
您可以使用:
df1['Date'] = pd.to_datetime(df1['Date'])
df1['Week'] = (df1.groupby('ID')['Date'].diff().dt.days
.groupby(df1['ID']).cumsum()
.fillna(0, downcast='infer') // 7 + 1)
df1['Week'] = 'Week' + df1['Week'].astype(str)
结果:
print(df1)
ID Date Week
0 A 2021-09-20 04:34:57 Week1
1 A 2021-09-21 18:54:22 Week1
2 A 2021-09-28 17:54:22 Week2
3 A 2021-09-30 18:54:22 Week2
4 A 2021-10-05 18:54:22 Week3
5 B 2021-09-04 19:04:38 Week1
6 B 2021-09-09 19:04:38 Week1
7 B 2021-09-17 19:04:38 Week2
8 C 2021-09-28 03:21:23 Week1
9 C 2021-09-30 03:21:23 Week1
10 C 2021-10-05 03:21:23 Week2
11 C 2021-10-15 03:21:23 Week3