Need help assigning custom time slots to datetime data

I have datetime data at one-minute resolution (sample below):

2021-11-08 00:10:00
2021-11-08 01:10:00
2021-11-08 02:25:00
2021-11-08 03:55:00
2021-11-08 06:55:00
2021-11-08 12:35:00
2021-11-08 16:05:00
2021-11-08 17:10:00
2021-11-08 18:45:00
2021-11-08 19:10:00
2021-11-08 20:25:00
2021-11-08 20:55:00
2021-11-08 22:55:00

I need to assign the custom time slots below to this dataset. Some slots start on the hour (e.g. 09:00) and some start mid-hour (e.g. 12:30):

'0000-0259'
'0300-0859'
'0900-1229'
'1230-1659'
'1700-1929'
'1930-2029'
'2030-2359'

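I first tried to do this with a dict that maps each hour to a time slot, but the 12:30 slot is tricky because it does not start on the hour.

Attempt 2 used between_time, but that requires a DatetimeIndex, so it does not work here: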

def time_slot(ref):
    if ref.between_time('00:00','02:59'):
        return '0000-0259'
    elif ref.between_time('03:00','08:59'):
        return '0300-0859'
    elif ref.between_time('09:00','12:29'):
        return '0900-1229'
    elif ref.between_time('12:30','16:59'):
        return '1230-1659'
    elif ref.between_time('17:00','19:29'):
        return '1700-1929'
    elif ref.between_time('19:30','20:29'):
        return '1930-2029'
    else:
        return '2030-2359'

Attempt 3 used chained if/elif checks, testing whether the value is < the upper bound of each time slot:

format = '%H:%M'

def time_slot(ref):
    if ref < dt.strptime('03:00', format):
        return '0000-0259'
    elif ref < dt.strptime('09:00', format):
        return '0300-0859'
    elif ref < dt.strptime('12:30', format):
        return '0900-1229'
    elif ref < dt.strptime('17:00', format):
        return '1230-1659'
    elif ref < dt.strptime('19:30', format):
        return '1700-1929'
    elif ref < dt.strptime('20:30', format):
        return '1930-2029'
    else:
        return '2030-2359'

But this fails with an error saying that datetime.time cannot be compared to datetime.datetime.
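The comparison fails because dt.strptime('03:00', '%H:%M') returns a full datetime.datetime (dated 1900-01-01), while ref is evidently a datetime.time. Below is a minimal sketch of attempt 3 that compares time-of-day values on both sides. It assumes ref is either a datetime.datetime (which includes a pandas Timestamp) or a datetime.time; SLOT_ENDS is a name introduced here for illustration only.

```python
from datetime import datetime, time

# Exclusive upper bound of each slot paired with its label, in ascending order.
# SLOT_ENDS is an illustrative name, not part of the original code.
SLOT_ENDS = [
    (time(3, 0), '0000-0259'),
    (time(9, 0), '0300-0859'),
    (time(12, 30), '0900-1229'),
    (time(17, 0), '1230-1659'),
    (time(19, 30), '1700-1929'),
    (time(20, 30), '1930-2029'),
]

def time_slot(ref):
    """Return the slot label for a datetime.datetime or datetime.time value."""
    # Reduce a full datetime to its time of day so both sides are datetime.time.
    t = ref.time() if isinstance(ref, datetime) else ref
    for end, label in SLOT_ENDS:
        if t < end:
            return label
    # Anything at or after 20:30 falls in the last slot.
    return '2030-2359'
```

With a column of Timestamps this could be applied as df['Time'].apply(time_slot), assuming the column has already been parsed to datetimes.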

Since the initial time data is in string format, I would approach it as follows. Start from a dataframe of this form:

    Time
0   2021-11-08 00:10:00
1   2021-11-08 01:10:00
2   2021-11-08 02:25:00
3   2021-11-08 03:55:00
4   2021-11-08 06:55:00
5   2021-11-08 12:35:00

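Step 1. Add a timestamp column (du is presumably dateutil here, i.e. import dateutil as du together with import dateutil.parser):

df['TimeStamp'] = df.apply(lambda row: du.parser.parse(row.Time), axis=1)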

This produces:

    Time    TimeStamp
0   2021-11-08 00:10:00     2021-11-08 00:10:00
1   2021-11-08 01:10:00     2021-11-08 01:10:00
2   2021-11-08 02:25:00     2021-11-08 02:25:00
3   2021-11-08 03:55:00     2021-11-08 03:55:00
4   2021-11-08 06:55:00     2021-11-08 06:55:00
5   2021-11-08 12:35:00     2021-11-08 12:35:00  
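The parsing in Step 1 can also be done for the whole column at once instead of row by row. This is a sketch using pandas.to_datetime rather than dateutil; the format string is an assumption based on the sample data, and the two-row dataframe is only there to make the snippet self-contained.

```python
import pandas as pd

# Small stand-in for the dataframe shown above.
df = pd.DataFrame({'Time': ['2021-11-08 00:10:00', '2021-11-08 12:35:00']})

# Parse every string in the column in one vectorized call.
# The explicit format is assumed from the sample; it can be omitted to let pandas infer it.
df['TimeStamp'] = pd.to_datetime(df['Time'], format='%Y-%m-%d %H:%M:%S')
```

The resulting column has a datetime64 dtype, so accessors such as df['TimeStamp'].dt.hour become available.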

Step 2. Create a function that returns the time slot label for each timestamp, as follows:

def getLabel(tval):
    """ Return the label associated with the timestamp """
    labels = ['0000-0259', '0300-0859', '0900-1229', '1230-1659', '1700-1929', '1930-2029', '2030-2359']
    slot_start = [(0, 0), (3, 0), (9, 0), (12, 30), (17, 0), (19, 30), (20, 30)]
    # Compare (hour, minute) tuples and walk the slots from latest to earliest,
    # returning the first slot that has already started
    current = (tval.hour, tval.minute)
    for label, start in zip(reversed(labels), reversed(slot_start)):
        if start <= current:
            return label
    return labels[0]

Step 3. Apply the getLabel function to create a Time_Ref column, as follows:

df['Time_Ref'] = df.apply(lambda row: getLabel(row.TimeStamp), axis=1)

This produces:

    Time    TimeStamp   Time_Ref
0   2021-11-08 00:10:00     2021-11-08 00:10:00     0000-0259
1   2021-11-08 01:10:00     2021-11-08 01:10:00     0000-0259
2   2021-11-08 02:25:00     2021-11-08 02:25:00     0000-0259
3   2021-11-08 03:55:00     2021-11-08 03:55:00     0300-0859
4   2021-11-08 06:55:00     2021-11-08 06:55:00     0300-0859
5   2021-11-08 12:35:00     2021-11-08 12:35:00     1230-1659
6   2021-11-08 16:05:00     2021-11-08 16:05:00     1230-1659
7   2021-11-08 17:10:00     2021-11-08 17:10:00     1700-1929
8   2021-11-08 18:45:00     2021-11-08 18:45:00     1700-1929
9   2021-11-08 19:10:00     2021-11-08 19:10:00     1700-1929
10  2021-11-08 20:25:00     2021-11-08 20:25:00     1930-2029
11  2021-11-08 20:55:00     2021-11-08 20:55:00     2030-2359

You can also fold the parsing from Step 1 into Step 3 as shown below, which removes the need to add a timestamp column:

df['Time_Ref'] = df.apply(lambda row: getLabel(du.parser.parse(row.Time)), axis=1)
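For larger datasets, a vectorized alternative to the row-wise apply may be worth considering. The sketch below is a different technique from the function above: it converts each timestamp to minutes since midnight and bins those with pandas.cut. The bin edges are derived from the slot start times in the question; the names bins and minutes are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Time': [
    '2021-11-08 00:10:00', '2021-11-08 03:55:00', '2021-11-08 12:35:00',
    '2021-11-08 19:10:00', '2021-11-08 20:25:00', '2021-11-08 22:55:00',
]})

labels = ['0000-0259', '0300-0859', '0900-1229', '1230-1659',
          '1700-1929', '1930-2029', '2030-2359']
# Slot start times expressed as minutes since midnight, plus the end of the day.
bins = [0, 3 * 60, 9 * 60, 12 * 60 + 30, 17 * 60, 19 * 60 + 30, 20 * 60 + 30, 24 * 60]

ts = pd.to_datetime(df['Time'])
minutes = ts.dt.hour * 60 + ts.dt.minute

# right=False makes each bin include its start and exclude its end,
# so 12:30 belongs to '1230-1659' and 12:29 to '0900-1229'.
df['Time_Ref'] = pd.cut(minutes, bins=bins, labels=labels, right=False)
```

Note that pd.cut returns a categorical column; df['Time_Ref'].astype(str) converts it to plain strings if that is preferred.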