How to merge a list containing data and datetime64[ns] values with a pandas dataframe that has a datetime64[ns] index
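I want to read two columns, S1_max and S2_max, from the dataframe data. For every value that appears in the S1_max column, I want to check whether that S1_max is followed by a corresponding S2_max signal. If it is, I calculate the time delta between the S1_max and S2_max signals. The result is then stored in a separate dict d, together with the datetime64[ns] index of the S2_max value, and appended to the list delta_data. How can I add this result to the existing data dataframe at the corresponding datetime64[ns] index?
This is how I create delta_data: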
#time between each S2 global maxima: 86 ns/samp freq 200 = 0.43 ns
#Checking that each S1 is succeeded by a corresponding S2 signal and calculating the time delta:
delta_data = []
diff_S1 = 0
diff_S2 = 0
i = 0
while (i + diff_S1 + 1 < len(peak_indexes_S1)) and (i + diff_S2 < len(peak_indexes_S2)):
    # Find next ppg peak after S1 peak
    while df["S2"].index[peak_indexes_S2[i + diff_S2]] < df["S1"].index[peak_indexes_S1[i + diff_S1]]:
        diff_S2 = diff_S2 + 1
    while df["S1"].index[peak_indexes_S1[i + diff_S1 + 1]] < df["S2"].index[peak_indexes_S2[i + diff_S2]]:
        diff_S1 = diff_S1 + 1
    i_peak_S2 = peak_indexes_S2[i + diff_S2]
    i_peak_S1 = peak_indexes_S1[i + diff_S1]
    d = {}
    # .microseconds is only the sub-second component of the Timedelta;
    # .total_seconds() would be needed for deltas of one second or more
    d["td"] = (df["S2"].index[i_peak_S2] - df["S1"].index[i_peak_S1]).microseconds
    d["time"] = df["S2"].index[i_peak_S2]
    # append to delta_data (the original appended to an undefined PATdata)
    delta_data.append(d)
    i = i + 1
time_delta = pd.DataFrame(delta_data)
time_delta (the DataFrame built from delta_data) prints as:
td time
0 355000 2019-08-07 13:06:31.010
1 355000 2019-08-07 13:06:31.850
2 355000 2019-08-07 13:06:32.695
This is my data dataframe:
l1 l2 l3 l4 S1 S2 S2_max S1_max
2019-08-07 13:11:21.485 0.572720 0.353433 0.701320 1.418840 4.939690 2.858326 2.858326 NaN
2019-08-07 13:11:21.490 0.572807 0.353526 0.701593 1.419052 4.939804 2.854604 NaN 4.939804
This dataframe is created by:
import pandas as pd
data = pd.read_csv('file.txt')
data.columns = ['l1','l2','l3','l4','S1','S2']
nbrMeasurments = sum(1 for line in open('file.txt'))
# one timestamp every 5 ms ("5L" is the older alias for "5ms")
data.index = pd.date_range('2019-08-07 13:06:30', periods=nbrMeasurments-1, freq="5ms")
I have tried DataFrame.combine_first and append.
The same problem also occurs when I try to add another dataframe to data. This dataframe has no milliseconds in its datetime range:
S3 S4
Date
2019-08-07 13:06:30 111 61
As far as I understand, you are trying to append another column to an existing DataFrame.
Here is how to do it:
df1 = pd.DataFrame({'names':['bla', 'blah', 'blahh'], 'values':[1,2,3]})
df2_to_concat = pd.DataFrame({'put_me_as_a_new_column':['row1', 'row2', 'row3']})
pd.concat([df1.reset_index(drop=True), df2_to_concat.reset_index(drop=True)], axis=1)
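reset_index(drop=True) gives both frames the same positional index (0, 1, 2, ...), so that concat with axis=1 does not produce NaN rows from mismatched labels or a duplicated index column. Note that rows are then paired by position, not by timestamp.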
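If the rows should instead be matched on the timestamps themselves, one possible approach is an index-based join. The following is a minimal sketch under stated assumptions: data and delta_data are small toy stand-ins with made-up values, and the names merged, other, exact, and filled are introduced here only for illustration.

```python
import pandas as pd

# Toy stand-in for `data`: a DatetimeIndex with 5 ms steps, as in the question.
idx = pd.date_range("2019-08-07 13:06:30", periods=6, freq="5ms")
data = pd.DataFrame({"S1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

# Toy stand-in for `delta_data`: a list of dicts with "td" and "time",
# where "time" is taken from the index of `data` (as in the question's loop).
delta_data = [
    {"td": 355000, "time": data.index[2]},
    {"td": 360000, "time": data.index[4]},
]

# Move "time" into the index so both frames are keyed by timestamp.
time_delta = pd.DataFrame(delta_data).set_index("time")

# Left join on the index: rows of `data` with no matching timestamp get NaN in "td".
merged = data.join(time_delta, how="left")

# A frame with whole-second timestamps (assumed shape of the S3/S4 frame).
other = pd.DataFrame(
    {"S3": [111], "S4": [61]},
    index=pd.DatetimeIndex([data.index[0]], name="Date"),
)

# An exact-match join only fills the row at 13:06:30.000 ...
exact = data.join(other, how="left")

# ... whereas reindexing with forward fill carries each value onto the finer grid.
filled = data.join(other.reindex(data.index, method="ffill"))

print(merged)
print(filled)
```

With the join, "td" appears only on the rows whose timestamp matches a "time" entry and is NaN elsewhere, which also explains why a second-resolution frame seems to "disappear" on a millisecond index: only exact timestamp matches line up unless the coarser frame is first reindexed onto the finer grid.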