Python: match values within a tolerance
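I am trying to match the values in one column of a data frame against the values in a column of a different data frame, within a tolerance.
I have two data frames: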
                             Dp  y_escape_ave(m)
0  [Series 1 at injection 12 1]        -0.015850
1  [Series 2 at injection 03 1]        -0.037345
2  [Series 1 at injection 06 1]        -0.037497
3  [Series 4 at injection 18 1]        -0.012622
4  [Series 5 at injection 21 1]              NaN
5  [Series 6 at injection 24 1]        -0.008801
6  [Series 7 at injection 27 1]        -0.008711
        v(m/s)       y(m)
0      0.000001  -0.007100
1      0.000001  -0.007131
2      0.000001  -0.007161
3      0.000001  -0.007192
4     60.012138  -0.007223
..          ...        ...
917   26.700808  -0.037577
918   26.764549  -0.037608
919   26.833567  -0.037639
920   26.889654  -0.037669
921   26.371773  -0.037700
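I am trying to approximately match the y_escape_ave(m) values of the first data frame (within a tolerance, y_tol) against the y(m) column of the second data frame, and then attach the corresponding v(m/s) value to each y_escape_ave(m) row. My idea was to do something similar to Excel's INDEX(MATCH;;-1) approach, but I can't get it to work.
My code so far is: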
vel_escape = []
vel_escape_temp = [[] for j in range(0,len(df_results.index)-1)]
for i in range(0, len(df_results.index)-1):
    for ii in range(0, len(df_vel_filt.index)-1):
        if df_results["y_escape_ave(m)"][i] == "":
            continue
        else:
            if abs(abs(df_results["y_escape_ave(m)"][i]) - abs(df_vel_filt["y(m)"][ii])) < y_tol:
                vel_escape_temp[i].append(df_vel_filt["v(m/s)"][ii])
    if len(vel_escape_temp[i]) <= 1:
        vel_escape.append(vel_escape_temp[i][0])
    else:
        vel_escape.append(statistics.mean(vel_escape_temp[i]))
Is there a simpler way?
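For reference, the intent of the loop (average every v(m/s) whose y(m) lies within y_tol of a given y_escape_ave(m)) can be written without explicit iteration using NumPy broadcasting. This is only a minimal sketch under stated assumptions: the sample values, the y_tol value, and the output column name v_escape(m/s) are made up for illustration, and it compares |a - b| directly rather than the abs(abs(a) - abs(b)) form used in the loop (the two agree when both values have the same sign, as they do in the data shown).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same column names as in the question.
df_results = pd.DataFrame({"y_escape_ave(m)": [-0.0071, -0.0375, np.nan, -0.0200]})
df_vel_filt = pd.DataFrame({
    "v(m/s)": [1.0, 3.0, 10.0, 20.0],
    "y(m)": [-0.00705, -0.00715, -0.03745, -0.03755],
})
y_tol = 1e-4  # assumed tolerance, in metres

y_query = df_results["y_escape_ave(m)"].to_numpy()
y_ref = df_vel_filt["y(m)"].to_numpy()
v_ref = df_vel_filt["v(m/s)"].to_numpy()

# within[i, j] is True when reference row j lies within y_tol of query row i.
# Any comparison involving NaN is False, so NaN queries match nothing.
within = np.abs(y_query[:, None] - y_ref[None, :]) < y_tol

counts = within.sum(axis=1)                        # matches per query row
sums = np.where(within, v_ref, 0.0).sum(axis=1)    # sum of matching velocities

# Mean of the matching velocities; NaN where nothing was within tolerance.
df_results["v_escape(m/s)"] = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
print(df_results)
```

The boolean matrix has len(df_results) x len(df_vel_filt) entries, which is negligible for frames of the size shown but worth keeping in mind for much larger inputs.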
You can try pandas.merge_asof:
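import pandas as pd

y_tol = None  # maximum allowed difference; None means no limit, so set your tolerance here
# merge_asof needs both key columns sorted and without NaN, hence sort_values and fillna(0).
# The default direction='backward' picks the last y(m) that is <= y_escape_ave(m).
df1['v(m/s)'] = pd.merge_asof(df1.sort_values('y_escape_ave(m)').fillna(0), df2.sort_values('y(m)'),
                              left_on='y_escape_ave(m)', right_on='y(m)', tolerance=y_tol)['v(m/s)']
print(df1)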
                             Dp  y_escape_ave(m)     v(m/s)
0  [Series 1 at injection 12 1]        -0.015850  26.700808
1  [Series 2 at injection 03 1]        -0.037345  26.700808
2  [Series 1 at injection 06 1]        -0.037497  26.700808
3  [Series 4 at injection 18 1]        -0.012622  26.700808
4  [Series 5 at injection 21 1]              NaN  26.700808
5  [Series 6 at injection 24 1]        -0.008801  26.700808
6  [Series 7 at injection 27 1]        -0.008711   0.000001
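A variant of the same call may be closer to what the question asks for. The sketch below passes a numeric tolerance, uses direction='nearest' so the closest y(m) on either side is considered, drops rows with a NaN key instead of filling them with 0, and maps the result back by original row label, since merge_asof returns its rows in sorted key order. The sample data and the y_tol value are made up for illustration. Unlike the loop in the question, this keeps only the single nearest match rather than averaging all matches.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the frames in the question.
df1 = pd.DataFrame({
    "Dp": ["a", "b", "c", "d"],
    "y_escape_ave(m)": [-0.015850, -0.037345, np.nan, -0.007150],
})
df2 = pd.DataFrame({
    "v(m/s)": [1.0, 2.0, 3.0, 4.0],
    "y(m)": [-0.007100, -0.007161, -0.037300, -0.037608],
})
y_tol = 1e-4  # assumed tolerance, in metres

# merge_asof rejects NaN keys, so match only the rows that have a value.
# reset_index() stores the original row labels in a column named "index".
left = (
    df1.dropna(subset=["y_escape_ave(m)"])
       .reset_index()
       .sort_values("y_escape_ave(m)")
)
right = df2.sort_values("y(m)")

matched = pd.merge_asof(
    left, right,
    left_on="y_escape_ave(m)", right_on="y(m)",
    direction="nearest",  # closest y(m) on either side
    tolerance=y_tol,      # leave NaN if the closest y(m) is farther than y_tol
)

# Align back to df1 by original row label; unmatched or dropped rows stay NaN.
df1["v(m/s)"] = matched.set_index("index")["v(m/s)"]
print(df1)
```

With this sample data, rows 1 and 3 pick up a velocity, while row 0 (no y(m) within tolerance) and row 2 (NaN key) are left as NaN.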