Calculate correlation matrix between types
I have a DataFrame df with the following 3 columns (tab-separated):
X Y types
0.3422 0.3214 pen
-0.1784 0.8621 pen
0.9932 0.1347 pencil
0.2847 -0.7634 pen
-0.6548 -0.2981 ruler
0.4792 0.3782 pencil
0.9231 -0.2949 ruler
The desired output is a correlation matrix like this:
pen pencil ruler
pen C1 C2 C3
pencil C4 C5 C6
ruler C7 C8 C9
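I tried .corr(), but it does not work with the structure of df.
Note: C1 is the correlation between pen and pen, C2 is the correlation between pen and pencil, and so on.
Any help?
IIUC, you can do this: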
res = df.groupby('types').mean().T.corr()
Output
types pen pencil ruler
types
pen 1.0 1.0 1.0
pencil 1.0 1.0 1.0
ruler 1.0 1.0 1.0
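To see why every entry above is 1.0, here is a minimal reproduction, assuming df is built from the sample rows in the question. After groupby('types').mean().T each type is a column holding only two values (its mean X and mean Y), and the Pearson correlation of two non-constant series with two observations each is always +1 or -1.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [0.3422, -0.1784, 0.9932, 0.2847, -0.6548, 0.4792, 0.9231],
    "Y": [0.3214, 0.8621, 0.1347, -0.7634, -0.2981, 0.3782, -0.2949],
    "types": ["pen", "pen", "pencil", "pen", "ruler", "pencil", "ruler"],
})

means = df.groupby("types").mean()  # one row per type, columns X and Y
profiles = means.T                  # one column per type, rows X and Y
res = profiles.corr()               # Pearson by default

print(profiles)
print(res)
```

In this sample the mean X is larger than the mean Y for all three types, so every pair comes out as +1 rather than -1.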
You can change the correlation method as needed, for example:
import numpy as np
res = df.groupby('types').mean().T.corr(method=np.dot)
print(res)
Output
types pen pencil ruler
types
pen 1.000000 0.145973 -0.021464
pencil 0.145973 1.000000 0.022724
ruler -0.021464 0.022724 1.000000
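As a sanity check on the np.dot result, the sketch below (again assuming df is built from the sample rows in the question) compares one off-diagonal entry with the dot product of the two per-type mean vectors. It also shows that the diagonal is set to 1 by corr itself, so it is not the dot product of a type with itself.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [0.3422, -0.1784, 0.9932, 0.2847, -0.6548, 0.4792, 0.9231],
    "Y": [0.3214, 0.8621, 0.1347, -0.7634, -0.2981, 0.3782, -0.2949],
    "types": ["pen", "pen", "pencil", "pen", "ruler", "pencil", "ruler"],
})

means = df.groupby("types").mean()
res = means.T.corr(method=np.dot)

# Off-diagonal entries are plain dot products of the mean (X, Y) vectors.
pen_pencil = np.dot(means.loc["pen"], means.loc["pencil"])
print(pen_pencil, res.loc["pen", "pencil"])

# The diagonal is forced to 1 and differs from the self dot product.
pen_pen = np.dot(means.loc["pen"], means.loc["pen"])
print(pen_pen, res.loc["pen", "pen"])
```

Because a dot product is not normalized, these values are not bounded by -1 and 1 the way a correlation coefficient is.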
The default method is Pearson correlation. From the documentation of the corr method:
method{‘pearson’, ‘kendall’, ‘spearman’} or callable Method of correlation:
pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
callable: callable with input two 1d ndarrays and returning a float. Note that the returned matrix from corr will have 1 along the diagonals and will be symmetric regardless of the callable’s behavior.
New in version 0.24.0.