Python: optimizing a loop over a DataFrame with max and min values
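I have a question about how to optimize my code, specifically the loops. I use them to compute the maximum of two values in a row, or sometimes the maximum of a row value and a number.
I tried to change my code using .loc and .clip, but when max or min shows up several times in one expression I have trouble with the logical expressions.
Originally it looked like this: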
def Calc(row):
    if row['Forecast'] == 0:
        return max(row['Qty'], 0)
    elif row['def'] == 1:
        return 0
    elif row['def'] == 0:
        return round(max(row['Qty'] - (max(row['Forecast_total']*14, (row['Qty_12m_1']+row['Qty_12m_2'])) * max(1, (row['Total']/row['Forecast'])/54)), 0))

df['Calc'] = df.apply(Calc, axis=1)
I managed to change it using the functions I mentioned, but I have a problem with how to write the nested max(max()) part:
df.loc[(combined_sf2['Forecast'] == 0),'Calc'] = df.clip(0,None)
df.loc[(combined_sf2['def'] == 1),'Calc'] = 0
df.loc[(combined_sf2['def'] == 0),'Calc'] = round(max(df['Qty']- (max(df['Forecast_total']
*14,(df['Qty_12m_1']+df['Qty_12m_2']))
*max(1, (df['Total']/df['Forecast'])/54)),0))
The first two assignments work; the last one does not.
id     Forecast  def  Calc      Qty  Forecast_total  Qty_12m_1  Qty_12m_2  Total
31551  0         0    0         2    0               0          0          95
27412  0,1       0    1         3    0,1             11         0          7
23995  0,1       0    0         4    0               1          0          7
27411  5,527     1    0,036186  60   0,2             64         0          183
28902  5,527     0    0,963814  33   5,327           277        0          183
23954  5,527     0    0         6    0               6          0          183
23994  5,527     0    0         8    0               0          0          183
31549  5,527     0    0         6    0               1          0          183
31550  5,527     0    0         6    0               10         0          183
Use numpy.select, and instead of max use numpy.maximum:
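import numpy as np

# Conditions, listed in the same order as the if/elif chain
m1 = df['Forecast'] == 0
m2 = df['def'] == 1
m3 = df['def'] == 0

# Values for the first and third branches (the second branch is the constant 0)
s1 = df['Qty'].clip(lower=0)
s3 = round(np.maximum(df['Qty'] - (np.maximum(df['Forecast_total']*14, (df['Qty_12m_1']+df['Qty_12m_2'])) * np.maximum(1, (df['Total']/df['Forecast'])/54)), 0))

df['Calc2'] = np.select([m1, m2, m3], [s1, 0, s3], default=None)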
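To see why the built-in max fails in the .loc attempt and to check that the vectorized version agrees with the row-wise one, here is a minimal, self-contained sketch. The sample values are made up for illustration (they are not the table from the question), the helper name calc_row and the intermediate names base, ratio and factor are my own, and default=np.nan is used instead of None only so that the result column stays numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Forecast":       [0.0, 0.1, 5.527, 5.527],
    "def":            [0,   0,   1,     0],
    "Qty":            [2,   3,   60,    33],
    "Forecast_total": [0.0, 0.1, 0.2,   0.1],
    "Qty_12m_1":      [0,   1,   64,    2],
    "Qty_12m_2":      [0,   0,   0,     0],
    "Total":          [95,  7,   183,   183],
})

# Row-wise reference, same logic as the original Calc function.
def calc_row(row):
    if row["Forecast"] == 0:
        return max(row["Qty"], 0)
    elif row["def"] == 1:
        return 0
    elif row["def"] == 0:
        base = max(row["Forecast_total"] * 14, row["Qty_12m_1"] + row["Qty_12m_2"])
        factor = max(1, (row["Total"] / row["Forecast"]) / 54)
        return round(max(row["Qty"] - base * factor, 0))

df["Calc"] = df.apply(calc_row, axis=1)

# The built-in max compares its arguments with > and then needs a single
# True/False, which a whole Series cannot provide.
try:
    max(df["Qty"], 0)
except ValueError as exc:
    print("built-in max on a Series:", exc)

# Vectorized version: np.maximum works element-wise on whole columns.
base = np.maximum(df["Forecast_total"] * 14, df["Qty_12m_1"] + df["Qty_12m_2"])
ratio = df["Total"] / df["Forecast"]   # inf where Forecast == 0; those rows are handled by m1
factor = np.maximum(1, ratio / 54)
s3 = np.maximum(df["Qty"] - base * factor, 0).round()

m1 = df["Forecast"] == 0
m2 = df["def"] == 1
m3 = df["def"] == 0
s1 = df["Qty"].clip(lower=0)

# np.select picks the first matching condition per row, like if/elif.
df["Calc2"] = np.select([m1, m2, m3], [s1, 0, s3], default=np.nan)

print(df[["Calc", "Calc2"]])
```

The order of the conditions matters: np.select takes the first condition that is true for each row, so m1 has to come first to reproduce the if/elif behaviour and to hide the division by a zero Forecast.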