pandas: output index value when values in a column become greater than last value in each column
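Full question:
Search each column of a DataFrame to find where a value first exceeds the value stored in the last row of that column, and output the corresponding index.
For example, df.head():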
Well A1 A2 A3 A4
Temperature
25.0 371.335253 360.026443 253.228769 593.436104
25.2 331.957145 332.224668 233.607595 561.057715
25.4 305.472591 303.777874 213.500582 535.310186
25.6 285.713623 274.069361 202.024427 515.261876
25.8 252.716374 254.610848 181.719415 488.988468
And df.tail():
Well A1 A2 A3 A4
Temperature
94.79 -441.775980 -664.549239 1060.674188 1158.481056
94.99 -492.189733 -709.521424 1029.628209 1087.625128
mean 280.759521 283.417750 201.471571 519.939366
std 72.404373 69.023406 45.447202 58.150127
4*std 570.377014 559.511373 383.260378 752.539875
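I want to take the 4*std value for A1 (570.37), search A1 from the top of the column for the first value greater than 570.37, and output the corresponding temperature. I need to repeat this for every column.
I would like the output as a new DataFrame, like the example below, but I don't know how to build it.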
Well Temp
A1 26.0
A2 27.6
A3 26.8
... ...
H12 27.2
Any help would be greatly appreciated!
If every column always contains at least one value above its threshold, I believe you need:
print (df)
A1 A2 A3 A4
Well Temperature
25.0 371.335253 360.026443 253.228769 593.436104
25.2 331.957145 632.224668 233.607595 561.057715
25.4 3005.472591 303.777874 213.500582 535.310186
25.6 285.713623 274.069361 202.024427 515.261876
25.8 252.716374 254.610848 181.719415 488.988468
94.79 -441.775980 -664.549239 1060.674188 1158.481056
94.99 -492.189733 -709.521424 1029.628209 1087.625128
mean 280.759521 283.417750 201.471571 519.939366
std 72.404373 69.023406 45.447202 58.150127
4*std 570.377014 559.511373 383.260378 752.539875
df1 = df.iloc[:-3].gt(df.iloc[-1]).idxmax().rename_axis('Well').reset_index(name='Temp')
print (df1)
Well Temp
0 A1 25.4
1 A2 25.2
2 A3 94.79
3 A4 94.79
Details:
print (df.iloc[:-3].gt(df.iloc[-1]))
A1 A2 A3 A4
Well Temperature
25.0 False False False False
25.2 False True False False
25.4 True False False False
25.6 False False False False
25.8 False False False False
94.79 False False True True
94.99 False False True True
print (df.iloc[:-3].gt(df.iloc[-1]).idxmax())
A1 25.4
A2 25.2
A3 94.79
A4 94.79
dtype: object
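The one-liner above can also be run end to end on a small frame. The sketch below is only an illustration: the two-column data is a hypothetical, shortened version of the frame shown earlier, and the helper name `first_exceeding` and its `n_summary_rows` parameter are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the frame above: measurement rows followed by
# three summary rows, the last of which ("4*std") holds the thresholds.
df = pd.DataFrame(
    {
        "A1": [371.3, 332.0, 3005.5, 285.7, 280.8, 72.4, 570.4],
        "A2": [360.0, 632.2, 303.8, 274.1, 283.4, 69.0, 559.5],
    },
    index=[25.0, 25.2, 25.4, 25.6, "mean", "std", "4*std"],
)


def first_exceeding(frame, n_summary_rows=3):
    """Return, per column, the first index label whose value exceeds the last row."""
    data = frame.iloc[:-n_summary_rows]  # measurement rows only
    threshold = frame.iloc[-1]           # one threshold per column
    mask = data.gt(threshold)            # Series is aligned on the column labels
    # idxmax on a boolean frame returns the label of the first True per column
    return mask.idxmax().rename_axis("Well").reset_index(name="Temp")


result = first_exceeding(df)
print(result)
```

Because `True` compares greater than `False`, `idxmax` picks the first `True` in each column. This only gives a meaningful answer when each column has at least one `True`.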
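If some column might have no value above its threshold, idxmax would return the first index label for that all-False column. One possible solution is to append a new row with a NaN index to the end: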
print (df)
A1 A2 A3 A4
Well Temperature
25.0 371.335253 360.026443 253.228769 593.436104
25.2 331.957145 332.224668 233.607595 561.057715
25.4 3005.472591 303.777874 213.500582 535.310186
25.6 285.713623 274.069361 202.024427 515.261876
25.8 252.716374 254.610848 181.719415 488.988468
94.79 -441.775980 -664.549239 1060.674188 1158.481056
94.99 -492.189733 -709.521424 1029.628209 1087.625128
mean 280.759521 283.417750 201.471571 519.939366
std 72.404373 69.023406 45.447202 58.150127
4*std 570.377014 559.511373 383.260378 752.539875
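# DataFrame.append was removed in pandas 2.0, so use pd.concat (needs: import numpy as np, import pandas as pd)
df1 = pd.concat([df.iloc[:-3], (df.iloc[-1] + 1).rename(np.nan).to_frame().T])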
print (df1)
A1 A2 A3 A4
Well Temperature
25.0 371.335253 360.026443 253.228769 593.436104
25.2 331.957145 332.224668 233.607595 561.057715
25.4 3005.472591 303.777874 213.500582 535.310186
25.6 285.713623 274.069361 202.024427 515.261876
25.8 252.716374 254.610848 181.719415 488.988468
94.79 -441.775980 -664.549239 1060.674188 1158.481056
94.99 -492.189733 -709.521424 1029.628209 1087.625128
NaN 571.377014 560.511373 384.260378 753.539875
df2 = df1.gt(df.iloc[-1]).idxmax().rename_axis('Well').reset_index(name='Temp')
print (df2)
Well Temp
0 A1 25.4
1 A2 NaN
2 A3 94.79
3 A4 94.79
print (df1.gt(df.iloc[-1]))
A1 A2 A3 A4
Well Temperature
25.0 False False False False
25.2 False False False False
25.4 True False False False
25.6 False False False False
25.8 False False False False
94.79 False False True True
94.99 False False True True
NaN True True True True
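A possible alternative to appending a sentinel row is to mask the `idxmax` result with `any()`, so columns that never exceed their threshold come out as NaN. This is a sketch under assumptions, not the original answer's method: the data is a hypothetical, shortened version of the frame above in which A2 never exceeds its threshold.

```python
import pandas as pd

# Hypothetical miniature frame: A2 never rises above its "4*std" threshold.
df = pd.DataFrame(
    {
        "A1": [371.3, 332.0, 3005.5, 285.7, 280.8, 72.4, 570.4],
        "A2": [360.0, 332.2, 303.8, 274.1, 283.4, 69.0, 559.5],
    },
    index=[25.0, 25.2, 25.4, 25.6, "mean", "std", "4*std"],
)

# Compare the measurement rows against the per-column threshold in the last row.
mask = df.iloc[:-3].gt(df.iloc[-1])

# Keep the idxmax label only for columns with at least one True;
# the others are replaced by NaN.
first_hit = mask.idxmax().where(mask.any())

result = first_hit.rename_axis("Well").reset_index(name="Temp")
print(result)
```

This avoids modifying the data, at the cost of computing the boolean mask once and reducing it twice.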