What's the fastest way to perform a calculation based on values of other columns in a DataFrame?

Question

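I have a DataFrame df, and I need to apply this formula:

    new_col[i] = 100 * log10( sum(ATR[i-n:i]) / (max(high[i-n:i]) - min(low[i-n:i])) ) / log10(n)

to each row, then add the resulting series as a new column.

My current code is: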

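    from collections import deque

    import numpy as np
    import pandas as pd

    new_col = deque()
    for i in range(len(df)):
        if i < n:
            # Fewer than n preceding rows: no full window yet
            new_col.append(0)
        else:
            # Window is the n rows before row i (row i itself is excluded)
            x = np.log10(np.sum(ATR[i-n:i]) / (max(high[i-n:i]) - min(low[i-n:i])))
            y = np.log10(n)
            new_col.append(100 * x / y)
    df['new_col'] = pd.DataFrame({"new_col": new_col})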

ATR, high and low are taken from columns of my existing df. This approach is very slow, though. Is there a faster way to do this? Thanks.

Answer 1

Without sample data I can't test the following, but it should work:

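    # Rolling max of High, min of Low and sum of ATR over windows of n rows
    tmp_df = df.rolling(n).agg({'High': 'max', 'Low': 'min', 'ATR': 'sum'})

    # 100 * log10(sum(ATR) / (max(High) - min(Low))) / log10(n)
    df['new_col'] = 100 * np.log10(tmp_df['ATR'] / (tmp_df['High'] - tmp_df['Low'])) / np.log10(n)

    # The loop uses rows i-n .. i-1 (row i excluded), so shift by one; the first n rows become 0
    df['new_col'] = df['new_col'].shift().fillna(0)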
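Since the answer was posted untested, here is a minimal sketch that checks a rolling-window version against the loop from the question. The function names `loop_version` and `rolling_version` are hypothetical, the column names `High`, `Low` and `ATR` are assumed from the answer, and the data is synthetic (the `ATR` column is just `High - Low`, a stand-in rather than a real ATR).

```python
import numpy as np
import pandas as pd


def loop_version(df: pd.DataFrame, n: int) -> pd.Series:
    """Row-by-row reference, following the loop in the question."""
    atr = df["ATR"].to_numpy()
    high = df["High"].to_numpy()
    low = df["Low"].to_numpy()
    out = np.zeros(len(df))  # rows 0 .. n-1 stay at 0
    for i in range(n, len(df)):
        # Window is the n rows before row i (row i itself is excluded)
        ratio = atr[i - n:i].sum() / (high[i - n:i].max() - low[i - n:i].min())
        out[i] = 100 * np.log10(ratio) / np.log10(n)
    return pd.Series(out, index=df.index)


def rolling_version(df: pd.DataFrame, n: int) -> pd.Series:
    """Vectorized version built on pandas rolling windows."""
    atr_sum = df["ATR"].rolling(n).sum()
    price_range = df["High"].rolling(n).max() - df["Low"].rolling(n).min()
    result = 100 * np.log10(atr_sum / price_range) / np.log10(n)
    # rolling(n) ends each window at the current row; shifting by one row
    # makes it end at the previous row, as in the loop. The first n rows
    # have no full window and are filled with 0.
    return result.shift(1).fillna(0)


# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, 500).cumsum()
high = close + rng.uniform(0.1, 1.0, 500)
low = close - rng.uniform(0.1, 1.0, 500)
demo = pd.DataFrame({"High": high, "Low": low, "ATR": high - low})

n = 14
same = np.allclose(loop_version(demo, n), rolling_version(demo, n))
print(same)
```

Calling `rolling` once per column instead of `agg` with a dict is only a stylistic choice here; both express the same windowed max, min and sum.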

python

numpy

logarithm

arithmetic-expressions

dataframe