Python: How can I add columns to a dataframe containing the log of some other columns using a for loop?

Question

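I have a dataframe with three columns of random numbers. I want to use a for loop to add three more columns containing the natural logarithm of each corresponding column.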

My dataframe is:

               K          L          Y
0      44.482983  22.612093  19.160614
1      44.131591  21.071627  44.804061
2      46.188112  21.420053  10.296304
3      38.231555  23.777519  19.128269
4      40.289477  32.450482  23.141743
...          ...        ...        ...
99995  48.793839  33.907988  35.769701
99996  41.654043  34.899131  14.866854
99997  49.602684  20.047823  11.387398
99998  47.265013  30.397463  36.708146
99999  49.375947  39.978109  45.814494
100000 rows × 3 columns

The following lines give me the result I want:

data['k'] = np.log(data['K'])
data['l'] = np.log(data['L'])
data['y'] = np.log(data['Y'])

The resulting dataframe looks like this:

               K          L          Y         k         l         y
0      44.482983  22.612093  19.160614  3.795107  3.118485  2.952857
1      44.131591  21.071627  44.804061  3.787176  3.047927  3.802299
2      46.188112  21.420053  10.296304  3.832722  3.064328  2.331785
3      38.231555  23.777519  19.128269  3.643661  3.168741  2.951167
4      40.289477  32.450482  23.141743  3.696090  3.479715  3.141638
...          ...        ...        ...       ...       ...       ...
99995  48.793839  33.907988  35.769701  3.887604  3.523651  3.577101
99996  41.654043  34.899131  14.866854  3.729398  3.552462  2.699134
99997  49.602684  20.047823  11.387398  3.904045  2.998121  2.432507
99998  47.265013  30.397463  36.708146  3.855770  3.414359  3.602999
99999  49.375947  39.978109  45.814494  3.899463  3.688332  3.824600

100000 rows × 6 columns

What I tried is...

for i in ['k', 'l', 'y']:
    for j in ['K', 'L', 'Y']:
        data[i] = np.log(data[j])

...but this only adds three columns that all contain the log of 'K'.

Where is the mistake in my for loop?

Answer 1

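You can do this with a one-liner (this form assumes the dataframe contains only the three numeric columns K, L, and Y):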

df[['k','l','y']] = np.log(df)

Or select the source columns by position:

df[['k','l','y']] = np.log(df.iloc[:,:3])

Or select them by name:

df[['k','l','y']] = np.log(df[['K','L','Y']])

           K          L          Y         k         l         y
0  44.482983  22.612093  19.160614  3.795107  3.118485  2.952857
1  44.131591  21.071627  44.804061  3.787176  3.047927  3.802299
2  46.188112  21.420053  10.296304  3.832722  3.064328  2.331785
3  38.231555  23.777519  19.128269  3.643661  3.168741  2.951167
4  40.289477  32.450482  23.141743  3.696090  3.479715  3.141638
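For anyone who wants to try this without the full dataset, here is a minimal, self-contained sketch of the name-based variant. It assumes NumPy and pandas are installed; the three-row frame is only a stand-in built from the first rows shown in the question, and `src` and `dst` are illustrative names, not part of the original code.

```python
import numpy as np
import pandas as pd

# Small stand-in for the 100000-row frame (first three rows from the question).
df = pd.DataFrame({
    "K": [44.482983, 44.131591, 46.188112],
    "L": [22.612093, 21.071627, 21.420053],
    "Y": [19.160614, 44.804061, 10.296304],
})

src = ["K", "L", "Y"]            # columns to take the log of
dst = [c.lower() for c in src]   # new column names: ['k', 'l', 'y']

# np.log applied to a DataFrame works element-wise and returns a DataFrame,
# which is then assigned to the three new columns in order.
df[dst] = np.log(df[src])

print(df)
```

Selecting the source columns explicitly keeps the statement correct even if the frame later gains other columns, which the bare `np.log(df)` form would not tolerate.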

Answer 2

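Your mistake is that you need to iterate over both lists together in a single loop, not in two nested loops. With nested loops, every new column is reassigned once per source column, so only the last assignment survives.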

The zip function is useful in this context. The following code should work:

for i,j in zip(['k', 'l', 'y'],['K', 'L', 'Y']):
    data[i] = np.log(data[j])
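To see the difference concretely, the sketch below runs both versions side by side on a small stand-in frame (the first rows from the question). The names `nested` and `paired` are illustrative only. In this sketch the nested version leaves all three new columns identical, each holding the log of whichever source column the inner loop visits last, while the zip version pairs each new column with its own source.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "K": [44.482983, 44.131591, 46.188112],
    "L": [22.612093, 21.071627, 21.420053],
    "Y": [19.160614, 44.804061, 10.296304],
})

# Nested loops: for each target i, the inner loop assigns three times,
# so data[i] is overwritten and only the final inner assignment remains.
nested = data.copy()
for i in ["k", "l", "y"]:
    for j in ["K", "L", "Y"]:
        nested[i] = np.log(nested[j])

# zip: one pass, each target name is paired with exactly one source name.
paired = data.copy()
for i, j in zip(["k", "l", "y"], ["K", "L", "Y"]):
    paired[i] = np.log(paired[j])

# All three nested columns are the same series; the paired ones differ.
print(nested["k"].equals(nested["l"]) and nested["l"].equals(nested["y"]))
print(paired[["k", "l", "y"]])
```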

python

pandas

dataframe

for-loop

logarithm