如何遍历数据框的列?

How to iterate through columns of the dataframe?

我想遍历数据框的所有列。这样我就可以得到该列的特定数据,使用这些数据我必须为另一个数据框计算。 我这里有 :

                         DP1         DP2         DP3         DP4         DP5         DP6         DP7         DP8        DP9       DP10       Total
OP1                  357848.0   1124788.0   1735330.0   2218270.0   2745596.0   3319994.0   3466336.0   3606286.0  3833515.0  3901463.0   3901463.0
OP2                  352118.0   1236139.0   2170033.0   3353322.0   3799067.0   4120063.0   4647867.0   4914039.0  5339085.0        NaN   5339085.0
OP3                  290507.0   1292306.0   2218525.0   3235179.0   3985995.0   4132918.0   4628910.0   4909315.0        NaN        NaN   4909315.0
OP4                  310608.0   1418858.0   2195047.0   3757447.0   4029929.0   4381982.0   4588268.0         NaN        NaN        NaN   4588268.0
OP5                  443160.0   1136350.0   2128333.0   2897821.0   3402672.0   3873311.0         NaN         NaN        NaN        NaN   3873311.0
OP6                  396132.0   1333217.0   2180715.0   2985752.0   3691712.0         NaN         NaN         NaN        NaN        NaN   3691712.0
OP7                  440832.0   1288463.0   2419861.0   3483130.0         NaN         NaN         NaN         NaN        NaN        NaN   3483130.0
OP8                  359480.0   1421128.0   2864498.0         NaN         NaN         NaN         NaN         NaN        NaN        NaN   2864498.0
OP9                  376686.0   1363294.0         NaN         NaN         NaN         NaN         NaN         NaN        NaN        NaN   1363294.0
OP10                 344014.0         NaN         NaN         NaN         NaN         NaN         NaN         NaN        NaN        NaN    344014.0
Total               3671385.0  11614543.0  17912342.0  21930921.0  21654971.0  19828268.0  17331381.0  13429640.0  9172600.0  3901463.0  34358090.0
Latest Observation   344014.0   1363294.0   2864498.0   3483130.0   3691712.0   3873311.0   4588268.0   4909315.0  5339085.0  3901463.0         NaN 

从这个 table 我想计算这个公式这个公式:在 DP1 列中,Total/Last 观察,这个答案是除以 DP2 列总数。像这样我们必须计算所有列并将其保存在另一个数据框中。

我们需要这样的行:

Weighted Average     3.491   1.747   1.457   1.174   1.104   1.086   1.054   1.077   1.018 

我们试过的代码:

LDFTriangledf['Weighted Average'] =CumulativePaidTriangledf.loc['Total','DP2']/(CumulativePaidTriangledf.loc['Total','DP1'] - CumulativePaidTriangledf.loc['Latest Observation','DP1'])

您可以从 .loc 中删除列名,只删除 shift(-1, axis=1) 以获得下一列的 Total。这使您可以在一次操作中将公式应用于所有列:

CumulativePaidTriangledf.shift(-1, axis=1).loc['Total'] / (CumulativePaidTriangledf.loc['Total'] - CumulativePaidTriangledf.loc['Latest Observation'])

# DP1      3.490607
# DP2      1.747333
# DP3      1.457413
# DP4      1.173852
# DP5      1.103824
# DP6      1.086269
# DP7      1.053874
# DP8      1.076555
# DP9      1.017725
# DP10          inf
# Total         NaN
# dtype: float64

以下是三个组件的作用的细分:

DP1 DP2 DP3 DP4 DP5 DP6 DP7 DP8 DP9 DP10 Total
A: .shift(-1, axis=1).loc['Total'] -- We are shifting the whole Total row to the left, so every column now has the next Total value. 1.161454e+07 1.791234e+07 2.193092e+07 2.165497e+07 1.982827e+07 1.733138e+07 1.342964e+07 9.172600e+06 3.901463e+06 34358090.0 NaN
B: .loc['Total'] -- This is the normal Total row. 3.671385e+06 1.161454e+07 1.791234e+07 2.193092e+07 2.165497e+07 1.982827e+07 1.733138e+07 1.342964e+07 9.172600e+06 3901463.0 34358090.0
C: .loc['Latest Observation'] -- This is the normal Latest Observation. 3.440140e+05 1.363294e+06 2.864498e+06 3.483130e+06 3.691712e+06 3.873311e+06 4.588268e+06 4.909315e+06 5.339085e+06 3901463.0 NaN
A / (B-C) -- This is what the code above does. It takes the shifted Total row (A) and divides it by the difference of the current Total row (B) and current Latest observation row (C). 3.490607 1.747333 1.457413 1.173852 1.103824 1.086269 1.053874 1.076555 1.017725 inf NaN