Adding a list of a different length to a pandas DataFrame with a datetime index, starting where a condition is met
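How can I insert a list of values at a specific position in a DataFrame with a datetime index, starting at the first row where a column's value exceeds some number? Here is an example: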
import numpy as np
import pandas as pd
example_list = [2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5]
example_index = pd.date_range('2022-01-01', periods=120, freq='1min')
example_df = pd.DataFrame({'example values': np.arange(120)})
example_df.index = example_index
example_df
Output:
example values
2022-01-01 00:00:00 0
2022-01-01 00:01:00 1
2022-01-01 00:02:00 2
2022-01-01 00:03:00 3
2022-01-01 00:04:00 4
... ...
2022-01-01 01:55:00 115
2022-01-01 01:56:00 116
2022-01-01 01:57:00 117
2022-01-01 01:58:00 118
2022-01-01 01:59:00 119
I want to insert example_list as a new column named "example_values_2", starting at the row where example values > 20. Is this possible?
IIUC, you can find the position of the first value that satisfies the condition and slice the index from there:
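# positional index of the first row where the condition is True
start = example_df['example values'].gt(20).argmax()
idx = example_df.index
# assign the list to the matching run of index labels; other rows become NaN
example_df.loc[idx[start:start+len(example_list)], 'example_values_2'] = example_list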
Output:
example values example_values_2
... ... ...
2022-01-01 00:20:00 20 NaN
2022-01-01 00:21:00 21 2.0
2022-01-01 00:22:00 22 2.0
2022-01-01 00:23:00 23 2.0
2022-01-01 00:24:00 24 2.0
2022-01-01 00:25:00 25 3.0
2022-01-01 00:26:00 26 3.0
2022-01-01 00:27:00 27 3.0
2022-01-01 00:28:00 28 3.0
2022-01-01 00:29:00 29 4.0
2022-01-01 00:30:00 30 4.0
2022-01-01 00:31:00 31 4.0
2022-01-01 00:32:00 32 4.0
2022-01-01 00:33:00 33 5.0
2022-01-01 00:34:00 34 5.0
2022-01-01 00:35:00 35 5.0
2022-01-01 00:36:00 36 NaN
... ... ...
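Two edge cases are worth keeping in mind with the snippet above: `argmax()` returns 0 when no row satisfies the condition, so the list would be written at the top of the frame, and the assignment raises a length mismatch error if the list would run past the last row. Here is a minimal sketch of a guarded variant. The helper name `insert_list_after_threshold` is hypothetical, and truncating the list at the end of the frame is an assumption about the desired behavior, not something stated in the question.

```python
import numpy as np
import pandas as pd


def insert_list_after_threshold(df, source_col, new_col, values, threshold):
    """Return a copy of df with `values` written into `new_col`, starting at
    the first row where df[source_col] > threshold. Other rows are NaN."""
    mask = df[source_col].gt(threshold).to_numpy()
    col = np.full(len(df), np.nan)
    if mask.any():  # skip entirely when no row matches (argmax would give 0)
        start = int(mask.argmax())
        # assumption: drop the tail of the list if it would overrun the frame
        n = min(len(values), len(df) - start)
        col[start:start + n] = values[:n]
    out = df.copy()
    out[new_col] = col
    return out


example_list = [2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5]
example_index = pd.date_range('2022-01-01', periods=120, freq='1min')
example_df = pd.DataFrame({'example values': np.arange(120)}, index=example_index)

result = insert_list_after_threshold(
    example_df, 'example values', 'example_values_2', example_list, 20
)
print(result.iloc[19:38])
```

With a threshold that is never exceeded, the new column stays entirely NaN rather than being filled from the first row.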