Need help assigning custom time slots to datetime data
I have minute-level datetime data (sample below):
2021-11-08 00:10:00
2021-11-08 01:10:00
2021-11-08 02:25:00
2021-11-08 03:55:00
2021-11-08 06:55:00
2021-11-08 12:35:00
2021-11-08 16:05:00
2021-11-08 17:10:00
2021-11-08 18:45:00
2021-11-08 19:10:00
2021-11-08 20:25:00
2021-11-08 20:55:00
2021-11-08 22:55:00
I need to assign the custom time slots below to this dataset. Some slots start on the hour (at 9:00) and some start on the half hour (at 12:30).
'0000-0259'
'0300-0859'
'0900-1229'
'1230-1659'
'1700-1929'
'1930-2029'
'2030-2359'
I have been trying to do this with a dict that maps each hour to a time slot, but the 1230 slot is tricky.
Attempt 2 uses between_time, but it needs a DatetimeIndex, so it does not work here:
def time_slot(ref):
    if ref.between_time('00:00','02:59'):
        return '0000-0259'
    elif ref.between_time('03:00','08:59'):
        return '0300-0859'
    elif ref.between_time('09:00','12:29'):
        return '0900-1229'
    elif ref.between_time('12:30','16:59'):
        return '1230-1659'
    elif ref.between_time('17:00','19:29'):
        return '1700-1929'
    elif ref.between_time('19:30','20:29'):
        return '1930-2029'
    else:
        return '2030-2359'
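For reference, between_time is a method of a Series or DataFrame whose index is a DatetimeIndex. It selects rows rather than returning True or False, so it cannot serve as the condition of an if on a single value. A minimal sketch of how it could be used, assuming the timestamps are moved into the index (the `slot` column and the `slots` list are names made up for this illustration):

```python
import pandas as pd

times = pd.to_datetime([
    "2021-11-08 00:10:00",
    "2021-11-08 03:55:00",
    "2021-11-08 12:35:00",
    "2021-11-08 19:10:00",
    "2021-11-08 20:25:00",
])
# between_time filters rows by time of day, so the timestamps must be the index.
df = pd.DataFrame({"slot": [""] * len(times)}, index=times)

# (start, end, label) for each custom slot; both end points are inclusive.
slots = [
    ("00:00", "02:59", "0000-0259"),
    ("03:00", "08:59", "0300-0859"),
    ("09:00", "12:29", "0900-1229"),
    ("12:30", "16:59", "1230-1659"),
    ("17:00", "19:29", "1700-1929"),
    ("19:30", "20:29", "1930-2029"),
    ("20:30", "23:59", "2030-2359"),
]

for start, end, label in slots:
    # Index of the rows whose time of day lies inside this slot.
    matching = df.between_time(start, end).index
    df.loc[df.index.isin(matching), "slot"] = label

print(df)
```

Because the end points are written to the minute, a timestamp with non-zero seconds in the last minute of a slot (for example 02:59:30) would stay unlabeled in this sketch; that is harmless for data that always falls on whole minutes.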
Attempt 3 sets up nested ifs that test whether the value is lower than each chosen time:
format = '%H:%M'
def time_slot(ref):
    if ref < dt.strptime('03:00', format):
        return '0000-0259'
    elif ref < dt.strptime('09:00', format):
        return '0300-0859'
    elif ref < dt.strptime('12:30', format):
        return '0900-1229'
    elif ref < dt.strptime('17:00', format):
        return '1700-1929'
    elif ref < dt.strptime('19:30', format):
        return '1930-2029'
    else:
        return '2030-2359'
But I cannot compare datetime.time with datetime.datetime.
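The comparison fails because strptime returns a full datetime (dated 1900-01-01) while the other operand is a different type. One way around it, sketched here under the assumption that `ref` is a datetime.datetime or a pandas Timestamp, is to take `ref.time()` and compare it with datetime.time boundaries. The `BOUNDARIES` name is made up for this illustration, and the sketch includes a '1230-1659' branch, which Attempt 3 does not have.

```python
from datetime import datetime, time

# (exclusive upper bound, label) pairs in ascending order.
BOUNDARIES = [
    (time(3, 0), "0000-0259"),
    (time(9, 0), "0300-0859"),
    (time(12, 30), "0900-1229"),
    (time(17, 0), "1230-1659"),
    (time(19, 30), "1700-1929"),
    (time(20, 30), "1930-2029"),
]


def time_slot(ref):
    """Return the slot label for a datetime by comparing time with time."""
    t = ref.time()
    for upper, label in BOUNDARIES:
        if t < upper:
            return label
    # Anything at or after 20:30 belongs to the last slot.
    return "2030-2359"


print(time_slot(datetime(2021, 11, 8, 12, 35)))  # 1230-1659
```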
Given that the initial time data is in string format, I would approach it as follows.
Given a dataframe of the form:
Time
0 2021-11-08 00:10:00
1 2021-11-08 01:10:00
2 2021-11-08 02:25:00
3 2021-11-08 03:55:00
4 2021-11-08 06:55:00
5 2021-11-08 12:35:00
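Step 1. Add a timestamp column:
import dateutil as du
import dateutil.parser  # loads the parser submodule so that du.parser resolves
df['TimeStamp'] = df.apply(lambda row: du.parser.parse(row.Time), axis=1)
Producing: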
Time TimeStamp
0 2021-11-08 00:10:00 2021-11-08 00:10:00
1 2021-11-08 01:10:00 2021-11-08 01:10:00
2 2021-11-08 02:25:00 2021-11-08 02:25:00
3 2021-11-08 03:55:00 2021-11-08 03:55:00
4 2021-11-08 06:55:00 2021-11-08 06:55:00
5 2021-11-08 12:35:00 2021-11-08 12:35:00
Step 2. Create a function that returns the time slot label for each timestamp:
def getLabel(tval):
    """Return the label of the time slot that contains the timestamp."""
    labels = ['0000-0259', '0300-0859', '0900-1229', '1230-1659', '1700-1929', '1930-2029', '2030-2359']
    slot_start = [(0, 0), (3, 0), (9, 0), (12, 30), (17, 0), (19, 30), (20, 30)]
    current = (tval.hour, tval.minute)
    label = labels[0]
    for start, name in zip(slot_start, labels):
        # Tuples compare element by element, so (12, 35) >= (12, 30).
        if current >= start:
            label = name  # latest slot that starts at or before the time
        else:
            break
    return label
Step 3. Apply the getLabel function to create a Time_Ref column:
df['Time_Ref'] = df.apply(lambda row: getLabel(row.TimeStamp), axis=1)
Producing:
Time TimeStamp Time_Ref
0 2021-11-08 00:10:00 2021-11-08 00:10:00 0000-0259
1 2021-11-08 01:10:00 2021-11-08 01:10:00 0000-0259
2 2021-11-08 02:25:00 2021-11-08 02:25:00 0000-0259
3 2021-11-08 03:55:00 2021-11-08 03:55:00 0300-0859
4 2021-11-08 06:55:00 2021-11-08 06:55:00 0300-0859
5 2021-11-08 12:35:00 2021-11-08 12:35:00 1230-1659
6 2021-11-08 16:05:00 2021-11-08 16:05:00 1230-1659
7 2021-11-08 17:10:00 2021-11-08 17:10:00 1700-1929
8 2021-11-08 18:45:00 2021-11-08 18:45:00 1700-1929
9 2021-11-08 19:10:00 2021-11-08 19:10:00 1700-1929
10 2021-11-08 20:25:00 2021-11-08 20:25:00 1930-2029
11 2021-11-08 20:55:00 2021-11-08 20:55:00 2030-2359
You can also fold the parsing from Step 1 into Step 3 with the following, which eliminates the step of adding the timestamp column:
df['Time_Ref'] = df.apply(lambda row: getLabel(du.parser.parse(row.Time)), axis=1)
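On larger frames a row-wise apply can be slow. A vectorized alternative, offered as a sketch rather than as part of the answer above, is to convert each time to minutes since midnight and bin it with pd.cut. It assumes the `Time` column holds strings that pd.to_datetime can parse; `edges`, `ts`, and `minutes` are names made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Time": [
    "2021-11-08 00:10:00",
    "2021-11-08 06:55:00",
    "2021-11-08 12:35:00",
    "2021-11-08 19:10:00",
    "2021-11-08 20:25:00",
    "2021-11-08 22:55:00",
]})

labels = ["0000-0259", "0300-0859", "0900-1229", "1230-1659",
          "1700-1929", "1930-2029", "2030-2359"]
# Slot starts in minutes since midnight, plus the end of the day (24 * 60).
edges = [0, 180, 540, 750, 1020, 1170, 1230, 1440]

ts = pd.to_datetime(df["Time"])
minutes = ts.dt.hour * 60 + ts.dt.minute
# right=False makes each bin [start, next_start), so 12:30 lands in '1230-1659'.
df["Time_Ref"] = pd.cut(minutes, bins=edges, labels=labels, right=False).astype(str)

print(df)
```

Adding or moving a slot then only means editing `labels` and `edges` together, keeping one more edge than labels.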