Unable to run PCA on a dataset

I am trying to run PCA on a loan dataset, which is split into a test set and a train set.

The code snippet is as follows:

from sklearn.decomposition import PCA
pca = PCA(n_components = 2)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
explained_variance = pca.explained_variance_ratio_

However, when I run it, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-102-829bdba63de3> in <module>
      3 pca = PCA(n_components = 2)
      4 X_train = pca.fit_transform(X_train)
----> 5 X_test = pca.transform(X_test)
      6 explained_variance = pca.explained_variance_ratio_

C:\Anaconda\lib\site-packages\sklearn\decomposition\base.py in transform(self, X)
    127         X = check_array(X)
    128         if self.mean_ is not None:
--> 129             X = X - self.mean_
    130         X_transformed = np.dot(X, self.components_.T)
    131         if self.whiten:

ValueError: operands could not be broadcast together with shapes (185,112) (2,) 

Can someone help me solve this? I can't figure out what went wrong.

To do PCA, all you need is:

import numpy as np
from sklearn.decomposition import PCA
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA(n_components=2)
pca.fit(X) 
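
As for the error itself: the shapes (185,112) and (2,) say that pca.mean_ has only 2 entries, so the PCA was fitted on an array with 2 columns while X_test still has 112. One possible cause (an assumption, since the rest of the notebook is not shown) is that X_train had already been overwritten with its 2-component projection by an earlier run, while X_test was reloaded or left untouched. A minimal sketch that avoids this by keeping the original arrays intact; the random data and the 400 training rows are hypothetical stand-ins:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the loan data: 112 features and 185 test rows
# come from the error message; the 400 training rows are an assumption.
X_train = rng.normal(size=(400, 112))
X_test = rng.normal(size=(185, 112))

# Both arrays must have the same number of columns before fitting.
assert X_train.shape[1] == X_test.shape[1]

pca = PCA(n_components=2)
# Store the projections under new names so the originals are not
# overwritten and the cell can be re-run safely.
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
explained_variance = pca.explained_variance_ratio_

print(X_train_pca.shape, X_test_pca.shape)  # (400, 2) (185, 2)
```

Printing X_train.shape and X_test.shape right before the PCA call in the real notebook should show whether the column counts already differ.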

Maybe you should add a label column marking the train rows, join test and train, and then do PCA.
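
A rough sketch of that idea, assuming both sets are pandas DataFrames with the same feature columns. The column names, the is_train tag, and the random data are all hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
cols = [f"f{i}" for i in range(5)]  # hypothetical feature names
train = pd.DataFrame(rng.normal(size=(8, 5)), columns=cols)
test = pd.DataFrame(rng.normal(size=(4, 5)), columns=cols)

# Tag each row with its origin, then stack the two frames.
combined = pd.concat(
    [train.assign(is_train=True), test.assign(is_train=False)],
    ignore_index=True,
)

pca = PCA(n_components=2)
# Fit on the feature columns only, not on the tag.
projected = pca.fit_transform(combined[cols])

# Split the projection back into train and test using the tag.
mask = combined["is_train"].to_numpy()
X_train_pca = projected[mask]
X_test_pca = projected[~mask]
```

Note that fitting on the combined data lets the test rows influence the components. If the test set is meant as a hold-out for evaluating a model, fitting on train only and calling transform on test is the more common choice.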