Determine n_components of PCA such that the explained variance ratio is 0.99

Is there an easy way to determine which n_components value to use for scikit-learn's PCA?

Personally, I use the following:

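import numpy as np
from sklearn.decomposition import PCA

wanted_explained_variance_ratio = 0.99
steps_down = 2
wanted_n_components = X_train.shape[1]  # fall back to keeping every feature
first_time = True

# Refit PCA with fewer and fewer components and report the total explained variance ratio
for i in range(X_train.shape[1]-1, 1, -steps_down):
  total_var_ratio = round(np.sum(PCA(n_components=i).fit(X_train).explained_variance_ratio_), 5)
  print('i =', i, 'with a variance ratio of', total_var_ratio)
  # The first i that drops below the target means the previous i was the last one that met it
  if total_var_ratio < wanted_explained_variance_ratio and first_time:
    # Cap at the feature count in case the very first i already misses the target
    wanted_n_components = min(i + steps_down, X_train.shape[1])
    first_time = False
    # break  (left commented out so that every i is still printed)

print("We should set n_components to: ", wanted_n_components)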

Expected output:

i = 28 with a variance ratio of 0.99975
i = 26 with a variance ratio of 0.99901
i = 24 with a variance ratio of 0.99807
i = 22 with a variance ratio of 0.99699
i = 20 with a variance ratio of 0.99574
i = 18 with a variance ratio of 0.99428
i = 16 with a variance ratio of 0.99195
i = 14 with a variance ratio of 0.98898
i = 12 with a variance ratio of 0.98534
i = 10 with a variance ratio of 0.98073
i = 8 with a variance ratio of 0.97405
i = 6 with a variance ratio of 0.96544
i = 4 with a variance ratio of 0.9539
i = 2 with a variance ratio of 0.93572
We should set n_components to:  16
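
For comparison, here is a minimal sketch of the same search done with a single fit instead of one fit per candidate i. It relies on explained_variance_ratio_ being sorted in decreasing order, so a cumulative sum gives the total ratio for every possible n_components at once. The helper name n_components_for_variance and the synthetic X_demo are made up for illustration; they stand in for X_train, which is not shown in the post.

```python
import numpy as np
from sklearn.decomposition import PCA


def n_components_for_variance(X, target=0.99):
    """Smallest number of components whose cumulative explained variance ratio reaches target.

    Hypothetical helper, not part of scikit-learn.
    """
    # Fit once with all components kept.
    pca = PCA(svd_solver="full").fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # Index of the first entry that is >= target, converted to a component count.
    idx = int(np.searchsorted(cumulative, target))
    # Guard against round-off leaving the final sum a hair below a target of 1.0.
    return min(idx + 1, len(cumulative))


# Synthetic stand-in for X_train: 29 features with decreasing scale (assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 29)) * np.linspace(5.0, 0.1, 29)

k = n_components_for_variance(X_demo, 0.99)
print("We should set n_components to: ", k)

# scikit-learn can also pick the count itself when n_components is a float in (0, 1).
k_builtin = PCA(n_components=0.99, svd_solver="full").fit(X_demo).n_components_
print("PCA(n_components=0.99) kept:", k_builtin)
```

Because this checks every component count rather than every second one, it may return a smaller value than the loop with steps_down = 2, which never tests the odd counts in between. If the goal is only to get the reduced data, passing the float directly, as in PCA(n_components=0.99, svd_solver="full"), avoids the manual search altogether.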