How to interpret principal component numbers to determine the percentage of variance explained (Python)?

I am trying to determine how many principal components explain more than 90% of the variance. I have the following:

from sklearn.decomposition import PCA
pca = PCA(n_components=11)
pca.fit_transform(X)

print(pca.explained_variance_, '\n\n') ##Line A

print(pca.explained_variance_ratio_) ##Line B

This outputs:

[1.79594388e+04 6.33546080e+02 4.45515520e+02 1.75087416e+02
 9.27041405e+01 4.09510643e+01 1.58667003e+01 6.04190503e+00
 3.33657900e+00 4.48917873e-01 1.06491531e-32] 


[9.27037479e-01 3.27026344e-02 2.29967979e-02 9.03773211e-03
 4.78523932e-03 2.11382838e-03 8.19013667e-04 3.11873465e-04
 1.72228866e-04 2.31724219e-05 5.49692234e-37]

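I am not sure whether to use Line A or Line B to determine the number of principal components that explain more than 90% of the variance. How should I interpret these numbers?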

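According to the documentation, you want Line B. Line A (explained_variance_) gives the raw variance captured by each component, while Line B (explained_variance_ratio_) expresses each of those as a fraction of the total variance, so the ratios sum to 1.0. The first component alone explains 92.7% of the variance, and the first two together explain almost 96%.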

line_b = [9.27037479e-01, 3.27026344e-02, 2.29967979e-02, 9.03773211e-03,
4.78523932e-03, 2.11382838e-03, 8.19013667e-04, 3.11873465e-04,
1.72228866e-04, 2.31724219e-05, 5.49692234e-37]

print(f"Percentage first component = {line_b[0]*100}")
print(f"Percentage first and second component = {sum(line_b[0:2])*100}")

Output:

Percentage first component = 92.7037479
Percentage first and second component = 95.97401134
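
To answer the "how many components for 90%" question directly, one option is to accumulate the ratios and pick the first position where the running total reaches the threshold. The following is a minimal sketch using only the Line B values from the question; the names ratios, cumulative, threshold and n_needed are illustrative and not part of scikit-learn.

```python
from itertools import accumulate

# Values copied from Line B (pca.explained_variance_ratio_) in the question.
ratios = [9.27037479e-01, 3.27026344e-02, 2.29967979e-02, 9.03773211e-03,
          4.78523932e-03, 2.11382838e-03, 8.19013667e-04, 3.11873465e-04,
          1.72228866e-04, 2.31724219e-05, 5.49692234e-37]

# Running total: cumulative[k-1] is the fraction explained by the first k components.
cumulative = list(accumulate(ratios))

# Assumed target; change this to whatever fraction of variance you need.
threshold = 0.90

# Smallest number of components whose cumulative ratio reaches the threshold.
n_needed = next(k for k, total in enumerate(cumulative, start=1) if total >= threshold)

print(f"Components needed for >= {threshold:.0%}: {n_needed}")
```

With a fitted model you would pass pca.explained_variance_ratio_ instead of the hard-coded list. scikit-learn can also make this choice itself: passing a float between 0 and 1, for example PCA(n_components=0.90, svd_solver='full'), keeps just enough components to exceed that fraction of explained variance.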