Find closest element of a pandas dataframe column in another column's list
I have the following dataframe:
import pandas as pd

A = [3, 38, 124]
B = [[0, 0, 1, 7, 34, 76, 4, 15, 28, 8, 7, 8, 200, 108, 7],
     [0, 0, 1, 7, 34],
     [4, 109, 71, 257, 3, 3, 7, 1, 0, 0, 7, 8, 100, 148, 54, 3, 134, 90, 23, 43, 17]]
df = pd.DataFrame({'A': A, 'B': B})
df
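Column B contains lists as elements. I want to create a new column that holds, for each row, the element of the list in B that is closest to the value in column A.
Desired output: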
A = [3,38,124]
B = [[0,0,1,7,34,76,4,15,28,8,7,8,200,108,7],[0,0,1,7,34],
[4,109,71,257,3,3,7,1,0,0,7,8,100,148,54,3,134,90,23,43,17]]
Desired_output=[4,34,134]
df_out = pd.DataFrame({'A':A,
'B':B,
'Desired_output':Desired_output})
df_out = df_out[['A', 'B', 'Desired_output']]
df_out
Try doing this before putting the data into the DataFrame:
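import numpy as np

# For each row i, take the element of B[i] with the smallest absolute difference from A[i]
C = [B[i][np.argmin(np.abs(np.array(B[i]) - A[i]))] for i in range(len(A))]
df = pd.DataFrame({'A': A, 'B': B, 'Closest': C})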
The output is:
     A                                                  B  Closest
0    3  [0, 0, 1, 7, 34, 76, 4, 15, 28, 8, 7, 8, 200, ...        4
1   38                                   [0, 0, 1, 7, 34]       34
2  124  [4, 109, 71, 257, 3, 3, 7, 1, 0, 0, 7, 8, 100,...      134
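For comparison, the same lookup can probably be written without NumPy, using the built-in min with a key function. This is only a sketch: the helper name closest is made up for illustration and does not come from the answer above. On ties, min returns the first matching element, which matches how np.argmin picks the first minimum.

```python
A = [3, 38, 124]
B = [[0, 0, 1, 7, 34, 76, 4, 15, 28, 8, 7, 8, 200, 108, 7],
     [0, 0, 1, 7, 34],
     [4, 109, 71, 257, 3, 3, 7, 1, 0, 0, 7, 8, 100, 148, 54, 3, 134, 90, 23, 43, 17]]

def closest(values, target):
    """Return the element of `values` nearest to `target` (hypothetical helper).

    On ties, the element that appears first in `values` is returned.
    """
    return min(values, key=lambda v: abs(v - target))

# Pair each value of A with its list in B and pick the nearest element
C = [closest(b, a) for a, b in zip(A, B)]
print(C)  # [4, 34, 134]
```

For short lists like these, the pure-Python version avoids building a NumPy array per row; for long lists the NumPy version is likely faster.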
To complete the previous answer: if you want to do this after the data is already in the DataFrame, use the DataFrame.apply function, as follows:
import pandas as pd
import numpy as np
A = [3, 38, 124]
B = [[0, 0, 1, 7, 34, 76, 4, 15, 28, 8, 7, 8, 200, 108, 7],
[0, 0, 1, 7, 34],
[4, 109, 71, 257, 3, 3, 7, 1, 0, 0, 7, 8, 100, 148, 54, 3, 134, 90, 23, 43, 17]]
df = pd.DataFrame({'A': A, 'B': B})
def find_nearest(row):
    # Index of the candidate in row["B"] with the smallest absolute difference from row["A"]
    return row["B"][np.argmin([abs(candidate - row["A"]) for candidate in row["B"]])]

df["desired_output"] = df.apply(find_nearest, axis=1)
print(df)
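One case the function above does not cover is an empty list in B: np.argmin raises a ValueError on an empty sequence. A possible variant that returns NaN for such rows is sketched below. The name find_nearest_safe and the sample data (including the empty list in the last row) are assumptions made for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

def find_nearest_safe(row):
    """Like find_nearest, but returns NaN when the list in B is empty (hypothetical variant)."""
    candidates = row["B"]
    if len(candidates) == 0:
        return np.nan
    # Absolute difference between every candidate and the value in A
    diffs = np.abs(np.asarray(candidates) - row["A"])
    return candidates[int(np.argmin(diffs))]

# Illustrative data: the last row has no candidates at all
df = pd.DataFrame({"A": [3, 38, 10],
                   "B": [[0, 4, 7], [0, 0, 1, 7, 34], []]})
df["closest"] = df.apply(find_nearest_safe, axis=1)
print(df)
```

Whether NaN is the right placeholder depends on how the column is used afterwards; raising an error or dropping such rows may suit other cases better.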