Applying matrix product in specific pandas columns

Question

I have a pandas DataFrame structured as follows:

   0     1    2     3         4         5         6      7      8     9 
0  42  2012  106  1200  0.112986 -0.647709 -0.303534  31.73  14.80  1096
1  42  2012  106  1200  0.185159 -0.588728 -0.249392  31.74  14.80  1097
2  42  2012  106  1200  0.199910 -0.547780 -0.226356  31.74  14.80  1096
3  42  2012  106  1200  0.065741 -0.796107 -0.099782  31.70  14.81  1097
4  42  2012  106  1200  0.116718 -0.780699 -0.043169  31.66  14.78  1094
5  42  2012  106  1200  0.280035 -0.788511 -0.171763  31.66  14.79  1094
6  42  2012  106  1200  0.311319 -0.663151 -0.271162  31.78  14.79  1094

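where columns 4, 5 and 6 are actually the components of a vector. I want to apply a matrix multiplication to these columns, i.e. replace columns 4, 5 and 6 with the vector obtained by multiplying the original vector by a matrix.

What I did is: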

    DC=[[ .. definition of multiplication matrix .. ]]
    def rotate(vector):
            return dot(DC, vector)
    data[[4,5,6]]=data[[4,5,6]].apply(rotate, axis='columns')

I thought this should work, but the returned DataFrame is exactly the same as the original.

What am I missing here?

Answer 1

Your code is correct, but slow. You can use the `values` attribute to get an ndarray and call `dot()` once to transform all the vectors at the same time:

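    import numpy as np
    import pandas as pd

    DC = np.random.randn(3, 3)  # example 3x3 matrix
    df = pd.DataFrame(np.random.randn(1000, 10))
    df2 = df.copy()

    # Vectorized: transform every row vector in a single dot() call.
    df[[4, 5, 6]] = np.dot(DC, df[[4, 5, 6]].values.T).T

    # Row-by-row version for comparison. Returning a Series keeps the column
    # labels; newer pandas versions may not expand a bare ndarray into columns.
    def rotate(vector):
        return pd.Series(np.dot(DC, vector), index=vector.index)

    df2[[4, 5, 6]] = df2[[4, 5, 6]].apply(rotate, axis='columns')

    # Compare with a tolerance, since the two paths can round differently.
    np.allclose(df, df2)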

On my PC, the vectorized version is about 90 times faster.
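As a side note, the double transpose in `np.dot(DC, X.T).T` can be avoided, because it is mathematically the same as `X @ DC.T` when each row of `X` is one vector. The following is only a minimal sketch under assumptions: `DC` is a random stand-in for the real matrix, the names `cols`, `X`, `expected` and `rotated` are illustrative, and the seeded generator is there purely to make the run reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)        # seeded only for reproducibility
DC = rng.standard_normal((3, 3))      # stand-in for the real 3x3 matrix
df = pd.DataFrame(rng.standard_normal((1000, 10)))

cols = [4, 5, 6]
X = df[cols].to_numpy()               # shape (1000, 3), one vector per row

# Reference result: multiply each row vector by DC, one at a time.
expected = np.array([DC @ v for v in X])

# Vectorized form: (DC @ X.T).T is the same product as X @ DC.T.
rotated = X @ DC.T

# Write the transformed components back into the same three columns.
df[cols] = rotated
```

Comparing `rotated` with `expected` via `np.allclose` is a quick way to confirm that the vectorized product matches the row-by-row one before relying on it.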

Tags: python, matrix, matrix-multiplication, pandas