Performing SVD with sklearn.decomposition.PCA, how can I get U, S and V from it?
Performing SVD

The SVD equation:

A = U x S x V_t

where V_t is the transpose of V. (Sorry, I can't paste the original equation.)

If I want the U, S, V matrices, how do I get them with sklearn.decomposition.PCA?
First of all, depending on the size of your matrix, the sklearn implementation of PCA does not always compute the full SVD decomposition. The following is taken from PCA's docstring in the GitHub repository:
svd_solver : string {'auto', 'full', 'arpack', 'randomized'}
auto :
the solver is selected by a default policy based on `X.shape` and
`n_components`: if the input data is larger than 500x500 and the
number of components to extract is lower than 80% of the smallest
dimension of the data, then the more efficient 'randomized'
method is enabled. Otherwise the exact full SVD is computed and
optionally truncated afterwards.
full :
run exact full SVD calling the standard LAPACK solver via
`scipy.linalg.svd` and select the components by postprocessing
arpack :
run SVD truncated to n_components calling ARPACK solver via
`scipy.sparse.linalg.svds`. It requires strictly
0 < n_components < X.shape[1]
randomized :
run randomized SVD by the method of Halko et al.
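The auto policy above can be overridden by passing svd_solver explicitly. A minimal sketch forcing the exact LAPACK path (the toy data here is just for illustration, reused from the example further down):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]], dtype=float)

# 'full' bypasses the auto policy and always runs the exact LAPACK
# SVD via scipy.linalg.svd, truncating to n_components afterwards.
pca = PCA(n_components=2, svd_solver='full')
pca.fit(X)

print(pca.singular_values_)  # singular values of the centered data
print(pca.components_)       # rows are the right singular vectors (V_t)
```

This exposes S (as singular_values_) and V_t (as components_) through the public API, without touching private methods.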
In addition, it performs some operations on the data beforehand (see here).
Now, if you want to obtain the U, S, V used inside sklearn.decomposition.PCA, you can use pca._fit(X). Note that _fit is a private method (the leading underscore means it may change between sklearn versions).

For example:
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]])
pca = PCA(n_components=2)
pca._fit(X)
This prints:
(array([[ -3.55731195e-01, 5.05615563e-01],
[ 2.88830295e-04, -3.68261259e-01],
[ 7.10884729e-01, -2.74708608e-01],
[ -5.68187889e-01, -4.43103380e-01],
[ 2.12745524e-01, 5.80457684e-01]]),
array([ 9.950385 , 0.76800941]),
array([[ 0.69988535, 0.71425521],
[ 0.71425521, -0.69988535]]))
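As a sanity check, the three arrays multiply back to the centered data, since PCA subtracts the column means before taking the SVD. A minimal sketch with numpy:

```python
import numpy as np

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]], dtype=float)
Xc = X - X.mean(axis=0)  # PCA centers the columns before the SVD

# Thin SVD of the centered matrix
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# U @ diag(S) @ V_t reproduces the centered data, not X itself
print(np.allclose(U @ np.diag(S) @ Vt, Xc))  # True
```

This is why the factors returned by PCA reconstruct X - mean(X) rather than the original X.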
However, if you just want the SVD decomposition of the original data, I would suggest using scipy.linalg.svd.