What does sklearn PCA do to the input array when the number of components is chosen to be the same as the number of features?
For example, suppose we have:
from sklearn.decomposition import PCA
import numpy as np
xx = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA()
pca.fit_transform(xx)
Output:
array([[ 1.38340578,  0.2935787 ],
       [ 2.22189802, -0.25133484],
       [ 3.6053038 ,  0.04224385],
       [-1.38340578, -0.2935787 ],
       [-2.22189802,  0.25133484],
       [-3.6053038 , -0.04224385]])
In this case I have not reduced the dimensionality, yet the array has changed... why?
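PCA performs a linear (rotation) transformation of your feature space. In your case, assuming feature 1 lies along x and feature 2 lies along y, the resulting transformation is the same as rotating the feature vectors through an angle theta ~ 2.565 radians. (PCA also subtracts the column means first, but your data already has zero mean, so that step changes nothing here.) Below I define such a rotation matrix and show that it gives the same result: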
import numpy as np

def rot_matrix(theta):
    # Return the 2x2 matrix for a counterclockwise rotation through angle theta (radians)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 2.565
rot = rot_matrix(theta)
# Rotate every row of xx (defined in the question): points become columns, then rows again
np.dot(rot, xx.T).T
The result is (close to) the output of the PCA transform:
array([[ 1.38349574,  0.29315446],
       [ 2.22182084, -0.25201619],
       [ 3.60531658,  0.04113827],
       [-1.38349574, -0.29315446],
       [-2.22182084,  0.25201619],
       [-3.60531658, -0.04113827]])
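Rather than guessing the angle, you can read the transformation off the fitted model. The following is a minimal sketch, assuming a default `PCA()` with no whitening. `components_` and `mean_` are attributes of the fitted estimator, while the names `W`, `manual`, and the boolean flags are only illustrative. It checks that the component matrix is orthogonal, that `fit_transform` amounts to centering followed by multiplication with that matrix, and that point lengths are preserved. The sign of each component can differ between scikit-learn versions, so the recovered angle may not be exactly 2.565, and a determinant of -1 would indicate a reflection rather than a pure rotation.

```python
import numpy as np
from sklearn.decomposition import PCA

xx = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])

pca = PCA()
transformed = pca.fit_transform(xx)

# Rows of components_ are the principal axes. With every component kept,
# this is a square matrix, and it should be orthogonal (W @ W.T == I).
W = pca.components_
is_orthogonal = np.allclose(W @ W.T, np.eye(2))

# The transform is "center the data, then project onto the principal axes".
manual = (xx - pca.mean_) @ W.T
matches_manual = np.allclose(manual, transformed)

# An orthogonal transform preserves lengths, so nothing is lost or reduced;
# the points are only expressed in a new coordinate system.
norms_preserved = np.allclose(
    np.linalg.norm(xx - pca.mean_, axis=1),
    np.linalg.norm(transformed, axis=1),
)

# det = +1 means a pure rotation; det = -1 means a reflection is involved.
det = np.linalg.det(W)

# Only meaningful when det is +1: the rotation angle in radians, read from
# the first column of W under the convention W = [[cos, -sin], [sin, cos]].
theta = np.arctan2(W[1, 0], W[0, 0])

print(is_orthogonal, matches_manual, norms_preserved)
print(det, theta)
```

Dimensionality only drops when you pass `n_components` smaller than the number of features. In that case the trailing rows of `components_` are discarded, which corresponds to dropping the last columns of the rotated array.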