Python: getting the dot product of two multidimensional arrays

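I realized that numpy.dot doesn't seem to handle multidimensional matrices the way I expected. My data looks like the tables below. I want to compute the dot product for every column (42 in total) except the first one.

This is what my data looks like (simplified). Data 1: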

0   4   6
-0.276  4403    4403
-0.138  4640    4640
0   0   0
0.138   12  0
0.276   0   0
0.414   0   0
0.552   0   0
0.69    0   0
0.828   0   12
0.966   0   0
1.104   0   12
1.242   0   0
1.38    0   0
1.518   0   0
1.656   0   0
1.794   0   0
1.932   0   0
2.07    0   0
2.208   0   0
2.346   0   0
2.484   0   0
2.622   0   12
2.76    0   0
2.898   0   0
3.036   0   0
3.174   0   0
3.312   0   0
3.45    0   0
3.588   0   0
3.726   0   0
3.864   12  0
4.002   0   0
4.14    0   0
4.278   12  0
4.416   0   0
4.554   0   12
4.692   0   0
4.83    0   0
4.968   0   0
5.106   0   0
5.244   0   0
5.382   12  0
5.52    0   0
5.658   0   0
5.796   127 60
5.934   357 275
6.072   1882    2144
6.21    6726    6609
6.348   9398    11180
6.486   12784   18389
6.624   15863   20111
6.762   6739    10202
6.9     1684    1921
7.038   249 376
7.176   47  103
7.314   0   26
7.452   17  0
7.59    0   0
7.728   0   0
7.866   0   0
8.004   0   0
8.142   0   0
8.28    0   0
8.418   0   0
8.556   0   0
8.694   0   0
8.832   0   0
8.97    0   0
9.108   0   0
9.246   0   0
9.384   0   0
9.522   0   0
9.66    0   0
9.798   0   0
9.936   0   0
10.074  0   0
10.212  0   0
10.35   0   12
10.488  0   0
10.626  0   0
10.764  0   0
10.902  0   0
11.04   0   0
11.178  0   0
11.316  0   0
11.454  0   0
11.592  0   0
11.73   0   0
11.868  0   0
12.006  0   0
12.144  0   0
12.282  0   0
12.42   0   0
12.558  0   0
12.696  12  0
12.834  0   0
12.972  0   0
13.11   0   0
13.248  0   0
13.386  12  0
13.524  0   0
13.662  0   12
13.8    0   0
13.938  0   0
14.076  0   0
14.214  0   0
14.352  0   0
14.49   0   0
14.628  12  0
14.766  0   0
14.904  12  0
15.042  0   0
15.18   0   0
15.318  0   0
15.456  0   0
15.594  0   0
15.732  0   0
15.87   0   0
16.008  0   0
16.146  0   0
16.284  0   0
16.422  0   0
16.56   12  0
16.698  0   0
16.836  0   0
16.974  0   0
17.112  0   0
17.25   0   0
17.388  0   0
17.526  0   0
17.664  0   12
17.802  0   0
17.94   0   0
18.078  0   0
18.216  0   0
18.354  0   0
18.492  0   0
18.63   12  0
18.768  0   0
18.906  0   0
19.044  0   0
19.182  0   0
19.32   0   0
19.458  0   0
19.596  0   0
19.734  0   0
19.872  0   0
20.01   0   0
20.148  0   12
20.286  12  0
20.424  0   12
20.562  0   0
20.7    0   0
20.838  0   0
20.976  0   0
21.114  0   0
21.252  0   0
21.39   0   12
21.528  0   0
21.666  0   0
21.804  12  0
21.942  0   0
22.08   0   0
22.218  0   0
22.356  0   0
22.494  0   0
22.632  0   0
22.77   0   0
22.908  0   0
23.046  0   0
23.184  0   0
23.322  0   0
23.46   12  0
23.598  0   12
23.736  0   0
23.874  0   0
24.012  0   0
24.15   0   0
24.288  0   0
24.426  0   0
24.564  0   0
24.702  0   0
24.84   0   0
24.978  0   0
25.116  0   0
25.254  0   0
25.392  0   0
25.53   0   0
25.668  0   0
25.806  12  0
25.944  12  0
26.082  0   0
26.22   0   0
26.358  0   12
26.496  0   0
26.634  0   0
26.772  0   0
26.91   0   0
27.048  13  0
27.186  0   0
27.324  0   0
27.462  0   0

Data 2:

0   4   6
-0.276  4400    4400
-0.138  4750    4750
0   0   0
0.138   12  0
0.276   0   0
0.414   0   12
0.552   0   0
0.69    0   25
0.828   0   0
0.966   12  13
1.104   0   0
1.242   0   12
1.38    0   0
1.518   12  0
1.656   0   0
1.794   0   12
1.932   0   0
2.07    12  0
2.208   0   0
2.346   0   0
2.484   12  0
2.622   0   0
2.76    24  0
2.898   0   0
3.036   0   0
3.174   12  0
3.312   0   0
3.45    0   0
3.588   0   12
3.726   39  0
3.864   0   12
4.002   0   0
4.14    0   12
4.278   0   0
4.416   0   0
4.554   0   0
4.692   0   0
4.83    0   0
4.968   0   0
5.106   0   0
5.244   0   0
5.382   0   0
5.52    0   12
5.658   0   0
5.796   0   0
5.934   0   0
6.072   43  46
6.21    6711    11323
6.348   91043   116679
6.486   241572  307822
6.624   250588  309749
6.762   105123  139651
6.9     16143   21264
7.038   2521    3648
7.176   1042    1022
7.314   576 910
7.452   482 552
7.59    229 416
7.728   210 227
7.866   120 149
8.004   69  55
8.142   47  0
8.28    26  65
8.418   0   20
8.556   0   0
8.694   0   0
8.832   0   12
8.97    12  38
9.108   0   0
9.246   18  0
9.384   0   0
9.522   0   13
9.66    0   0
9.798   0   18
9.936   16  0
10.074  12  0
10.212  0   0
10.35   12  0
10.488  0   0
10.626  0   23
10.764  0   0
10.902  0   0
11.04   20  0
11.178  0   0
11.316  0   0
11.454  0   0
11.592  0   0
11.73   0   12
11.868  14  12
12.006  0   0
12.144  0   0
12.282  0   0
12.42   0   0
12.558  0   12
12.696  0   0
12.834  0   0
12.972  12  0
13.11   0   0
13.248  0   0
13.386  0   18
13.524  0   0
13.662  12  0
13.8    12  0
13.938  13  0
14.076  0   0
14.214  0   0
14.352  0   0
14.49   0   0
14.628  24  0
14.766  0   15
14.904  0   16
15.042  0   12
15.18   12  0
15.318  0   12
15.456  0   0
15.594  0   0
15.732  14  13
15.87   0   23
16.008  0   0
16.146  0   0
16.284  0   16
16.422  0   12
16.56   0   0
16.698  0   0
16.836  0   0
16.974  0   13
17.112  0   0
17.25   0   0
17.388  16  0
17.526  0   12
17.664  0   0
17.802  0   0
17.94   0   12
18.078  0   0
18.216  0   0
18.354  0   19
18.492  0   0
18.63   0   0
18.768  0   12
18.906  0   0
19.044  0   12
19.182  0   12
19.32   0   0
19.458  0   0
19.596  12  24
19.734  0   0
19.872  0   0
20.01   0   0
20.148  0   0
20.286  0   0
20.424  0   12
20.562  12  0
20.7    0   0
20.838  0   0
20.976  0   0
21.114  0   0
21.252  0   0
21.39   0   12
21.528  12  12
21.666  0   0
21.804  12  0
21.942  0   0
22.08   0   0
22.218  0   0
22.356  0   12
22.494  0   0
22.632  12  0
22.77   0   0
22.908  0   0
23.046  12  0
23.184  0   0
23.322  12  0
23.46   0   0
23.598  13  16
23.736  24  17
23.874  0   0
24.012  12  0
24.15   0   0
24.288  0   0
24.426  12  0
24.564  0   0
24.702  0   0
24.84   0   0
24.978  0   0
25.116  0   0
25.254  0   0
25.392  14  12
25.53   25  0
25.668  0   12
25.806  0   0
25.944  0   15
26.082  0   0
26.22   12  0
26.358  0   0
26.496  0   0
26.634  0   0
26.772  27  0
26.91   0   12
27.048  0   22
27.186  0   0
27.324  0   0
27.462  0   0

Then I have the code below:

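import pandas as pd
import numpy as np

# first_df and second_df hold Data 1 and Data 2; drop the first column of each
first_y = np.array(first_df.iloc[:, 1:])
second_y = np.array(second_df.iloc[:, 1:])

# dot product (this is the call that raises the error)
dot_product_both = np.dot(first_y, second_y)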

With the code above, I get the following error:

shapes (200,42) and (200,42) not aligned: 42 (dim 1) != 200 (dim 0)

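I assume I'm getting this error because np.dot can't handle arrays with more than one dimension.

I was thinking of writing a function that breaks my data frames apart and processes the columns one by one. Before I do that, I wanted to see whether any of you have a smarter way to solve this.

The result I want is, for the second column: 4403*4400+4640*4750+0*0+12*12....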

IIUC, you want:

(x*y).sum(axis=0)

which translates to 4403*4400+4640*4750+0*0+12*12 + ...
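Applied to the frames in the question, a minimal sketch might look like the following. The small `first_df` and `second_df` here are made-up stand-ins built from the first few rows shown in the question (column "0" plays the role of the first column that should be skipped); the real frames would have 200 rows and 43 columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's data: column "0" is the
# column to skip, the remaining columns hold the values to multiply.
first_df = pd.DataFrame({"0": [-0.276, -0.138, 0.0, 0.138],
                         "4": [4403, 4640, 0, 12],
                         "6": [4403, 4640, 0, 0]})
second_df = pd.DataFrame({"0": [-0.276, -0.138, 0.0, 0.138],
                          "4": [4400, 4750, 0, 12],
                          "6": [4400, 4750, 0, 0]})

# Drop the first column and convert to plain NumPy arrays.
first_y = first_df.iloc[:, 1:].to_numpy()
second_y = second_df.iloc[:, 1:].to_numpy()

# Element-wise product, then sum down each column:
# one dot product per pair of matching columns.
column_dots = (first_y * second_y).sum(axis=0)
print(column_dots)
```

The result has one entry per data column, so with the full data it would have length 42.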

Take a look at this reproducible example:

x = np.array([
    [1,2,3],
    [4,5,6]
])
y = np.array([
    [1,10,1],
    [2,2,3]
])

Then

>>> (x*y)

array([[ 1, 20,  3],
       [ 8, 10, 18]])

>>> (x*y).sum(axis=1)
array([24, 36])

>>> (x*y).sum(axis=0)
array([ 9, 30, 21])

Note that these values are really just the diagonals of the np.dot products:

>>> np.dot(x,y.T)
array([[24, 15],
       [60, 36]])

>>> np.dot(x.T,y)
array([[ 9, 18, 13],
       [12, 30, 17],
       [15, 42, 21]])
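
This also explains the error in the question: for 2-D inputs, np.dot performs matrix multiplication, so two (200, 42) arrays cannot be multiplied directly, and transposing the first one yields a 42 x 42 matrix of which only the diagonal is wanted. If building that full matrix feels wasteful, `np.einsum` can compute just the column-wise sums. The sketch below reuses the small `x` and `y` from above and computes the same values three ways; it is one possible alternative, not the only one.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])
y = np.array([[1, 10, 1],
              [2, 2, 3]])

# Sum over the row index i for each column j:
# col_dots[j] = sum_i x[i, j] * y[i, j]
col_dots = np.einsum("ij,ij->j", x, y)

# Same values as the diagonal of the full matrix product x.T @ y ...
via_diag = np.diag(np.dot(x.T, y))

# ... and as the element-wise product summed down the columns.
via_sum = (x * y).sum(axis=0)

print(col_dots)  # [ 9 30 21]
```

For 42 columns the difference is negligible, so `(x*y).sum(axis=0)` is probably the most readable choice; the einsum form mainly matters when the number of columns is large.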