How to divide data into night time data and day time data in pandas
Hi all,
I need help splitting this pandas DataFrame into night and day data. Let's assume that night is after 17:00 and before 08:30, and that day is between 08:30 and 17:00.
Date Time Open High Low Close Vol
7 2019-09-02 05:00 11919.9 11929.7 11917.7 11918.9 240
8 2019-09-02 06:00 11920.7 11940.4 11917.7 11927.9 240
9 2019-09-02 07:00 11927.4 11966.2 11927.2 11936.4 240
10 2019-09-02 08:00 11936.9 11955.9 11928.1 11951.4 240
11 2019-09-02 09:00 11951.4 11960.2 11939.4 11954.4 240
12 2019-09-02 10:00 11953.9 11995.9 11951.4 11976.9 240
13 2019-09-02 11:00 11976.7 11979.4 11956.2 11965.9 240
14 2019-09-02 12:00 11966.2 11971.4 11956.4 11965.4 240
15 2019-09-02 13:00 11965.7 11969.7 11943.4 11947.7 240
16 2019-09-02 14:00 11947.4 11962.4 11943.9 11960.7 240
17 2019-09-02 15:00 11960.9 11964.2 11901.2 11934.9 240
18 2019-09-02 16:00 11934.9 11939.7 11921.4 11929.7 240
19 2019-09-02 17:00 11929.9 11940.4 11928.4 11938.2 236
20 2019-09-02 18:00 11937.9 11938.2 11934.7 11938.2 176
21 2019-09-02 19:00 11937.9 11948.7 11937.7 11943.2 196
between_time only returns times within the current day, so it does not do the job on its own.
One idea is to convert the Time column to timedeltas, use Series.between, and filter with the resulting boolean mask:
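import pandas as pd

# Append ':00' so that 'HH:MM' parses as a timedelta; between() includes both endpoints
mask = (pd.to_timedelta(df['Time'].astype(str).add(':00'))
          .between(pd.Timedelta('08:30:00'), pd.Timedelta('17:00:00')))
df1 = df[mask]   # day rows
print(df1)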
Date Time Open High Low Close Vol
11 2019-09-02 09:00 11951.4 11960.2 11939.4 11954.4 240
12 2019-09-02 10:00 11953.9 11995.9 11951.4 11976.9 240
13 2019-09-02 11:00 11976.7 11979.4 11956.2 11965.9 240
14 2019-09-02 12:00 11966.2 11971.4 11956.4 11965.4 240
15 2019-09-02 13:00 11965.7 11969.7 11943.4 11947.7 240
16 2019-09-02 14:00 11947.4 11962.4 11943.9 11960.7 240
17 2019-09-02 15:00 11960.9 11964.2 11901.2 11934.9 240
18 2019-09-02 16:00 11934.9 11939.7 11921.4 11929.7 240
19 2019-09-02 17:00 11929.9 11940.4 11928.4 11938.2 236
df2 = df[~mask]  # night rows: everything outside the day window
print(df2)
Date Time Open High Low Close Vol
7 2019-09-02 05:00 11919.9 11929.7 11917.7 11918.9 240
8 2019-09-02 06:00 11920.7 11940.4 11917.7 11927.9 240
9 2019-09-02 07:00 11927.4 11966.2 11927.2 11936.4 240
10 2019-09-02 08:00 11936.9 11955.9 11928.1 11951.4 240
20 2019-09-02 18:00 11937.9 11938.2 11934.7 11938.2 176
21 2019-09-02 19:00 11937.9 11948.7 11937.7 11943.2 196
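If a single labelled frame is more convenient than two separate ones, the same mask can feed a label column. The following is only a minimal sketch: the five-row sample and the column name `period` are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Small hypothetical sample in the same shape as the question's data.
df = pd.DataFrame({
    "Date": ["2019-09-02"] * 5,
    "Time": ["08:00", "09:00", "16:00", "17:00", "18:00"],
    "Close": [11951.4, 11954.4, 11929.7, 11938.2, 11938.2],
})

# "HH:MM" -> "HH:MM:00" so that to_timedelta can parse it.
elapsed = pd.to_timedelta(df["Time"].astype(str) + ":00")

# Series.between is inclusive on both ends by default,
# so 17:00 itself counts as day here.
is_day = elapsed.between(pd.Timedelta("08:30:00"), pd.Timedelta("17:00:00"))

# "period" is an illustrative column name, not one from the question.
df["period"] = np.where(is_day, "day", "night")
print(df)
```

Because the comparison uses only the time of day, the mask should behave the same way when the frame spans several dates.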
EDIT:
Another idea is DataFrame.between_time, but it requires a DatetimeIndex:
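# Join date and time with a space so the combined string parses unambiguously
df['Datetime'] = pd.to_datetime(df['Date'].astype(str) + ' ' + df['Time'].astype(str))
df = df.set_index('Datetime')
# Day window from the question (08:30 to 17:00, both endpoints included by default)
day = df.between_time('08:30', '17:00')
night = df[~df.index.isin(day.index)]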
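On the concern raised in the question about multiple dates: between_time filters on the time of day only, so it should pick the day window out of every date in the index. A small sketch with invented timestamps and prices (the name `prices` and all values are hypothetical):

```python
import pandas as pd

# Hypothetical two-day sample; the values are invented for illustration.
idx = pd.to_datetime([
    "2019-09-02 08:00", "2019-09-02 12:00", "2019-09-02 18:00",
    "2019-09-03 03:00", "2019-09-03 09:00", "2019-09-03 17:00",
])
prices = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

# Rows whose time of day falls in 08:30-17:00, on any date.
day = prices.between_time("08:30", "17:00")

# Everything else is night; drop() relies on the index labels being unique.
night = prices.drop(day.index)

print(day)
print(night)
```

If the index can contain duplicate timestamps, the `isin`-based filter shown above is the safer way to build the night frame.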
I would try something like this, obviously changing the times to the ones you need! But this is the general idea.
In [58]: df = pd.DataFrame({"Time":[
...: "05:00",
...: "06:00",
...: "07:00",
...: "08:00",
...: "09:00",
...: "10:00",
...: "11:00",
...: "12:00",
...: "13:00",
...: "14:00",
...: "15:00",
...: "16:00",
...: "17:00",
...: "18:00",
...: "19:00"]})
In [59]: df = df.set_index(pd.to_datetime(df["Time"]))
In [60]: df
Out[60]:
Time
Time
2019-09-15 05:00:00 05:00
2019-09-15 06:00:00 06:00
2019-09-15 07:00:00 07:00
2019-09-15 08:00:00 08:00
2019-09-15 09:00:00 09:00
2019-09-15 10:00:00 10:00
2019-09-15 11:00:00 11:00
2019-09-15 12:00:00 12:00
2019-09-15 13:00:00 13:00
2019-09-15 14:00:00 14:00
2019-09-15 15:00:00 15:00
2019-09-15 16:00:00 16:00
2019-09-15 17:00:00 17:00
2019-09-15 18:00:00 18:00
2019-09-15 19:00:00 19:00
In [61]: df["time_desc"] = "night"
In [62]: df
Out[62]:
Time time_desc
Time
2019-09-15 05:00:00 05:00 night
2019-09-15 06:00:00 06:00 night
2019-09-15 07:00:00 07:00 night
2019-09-15 08:00:00 08:00 night
2019-09-15 09:00:00 09:00 night
2019-09-15 10:00:00 10:00 night
2019-09-15 11:00:00 11:00 night
2019-09-15 12:00:00 12:00 night
2019-09-15 13:00:00 13:00 night
2019-09-15 14:00:00 14:00 night
2019-09-15 15:00:00 15:00 night
2019-09-15 16:00:00 16:00 night
2019-09-15 17:00:00 17:00 night
2019-09-15 18:00:00 18:00 night
2019-09-15 19:00:00 19:00 night
In [63]: df.loc[df.between_time("06:30", "18:00").index, "time_desc"] = "day"
In [64]: df
Out[64]:
Time time_desc
Time
2019-09-15 05:00:00 05:00 night
2019-09-15 06:00:00 06:00 night
2019-09-15 07:00:00 07:00 day
2019-09-15 08:00:00 08:00 day
2019-09-15 09:00:00 09:00 day
2019-09-15 10:00:00 10:00 day
2019-09-15 11:00:00 11:00 day
2019-09-15 12:00:00 12:00 day
2019-09-15 13:00:00 13:00 day
2019-09-15 14:00:00 14:00 day
2019-09-15 15:00:00 15:00 day
2019-09-15 16:00:00 16:00 day
2019-09-15 17:00:00 17:00 day
2019-09-15 18:00:00 18:00 day
2019-09-15 19:00:00 19:00 night
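A possible variation on the same labelling idea uses positions instead of index labels, via DatetimeIndex.indexer_between_time. This is a sketch only; the four timestamps and the name `frame` are made up, and the 06:30 to 18:00 window is the one from the session above.

```python
import numpy as np
import pandas as pd

# Hypothetical timestamps for illustration.
idx = pd.to_datetime([
    "2019-09-15 05:00", "2019-09-15 07:00",
    "2019-09-15 18:00", "2019-09-15 19:00",
])
frame = pd.DataFrame(index=idx)

# Start with every row labelled "night".
labels = np.full(len(frame), "night", dtype=object)

# indexer_between_time returns integer positions of the rows whose
# time of day lies in the window (both endpoints included by default).
labels[idx.indexer_between_time("06:30", "18:00")] = "day"

frame["time_desc"] = labels
print(frame)
```

Working with positions avoids the label lookup in `df.loc[...]`, which may matter when the index contains repeated timestamps.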