How to conveniently perform a PCA inverse transform with fewer components?
I am exploring the structure of my data and plotting the variance explained by each component, so I fit a PCA with the number of components equal to the number of dimensions. Is there a way to perform the inverse transform using fewer components? Something like:
import numpy as np
import sklearn.decomposition

data = np.random.rand(100, 10)  # data of size (N_objects, n_dim)
n_dim = data.shape[1]
pca = sklearn.decomposition.PCA(n_dim)
transformed = pca.fit_transform(data)
# then I want to see the reconstruction from different numbers of components
new_data_1 = pca.inverse_transform(transformed, use_components=n_dim // 2)
new_data_2 = pca.inverse_transform(transformed, use_components=n_dim // 3)
new_data_3 = pca.inverse_transform(transformed, use_components=n_dim // 4)
The problem is that the inverse_transform method has no use_components parameter, so I am wondering whether there is an elegant way to do something like this, or whether I have to re-fit the PCA object with a different number of components each time?
One possible approach is to selectively zero out the component vectors:
import numpy as np
import sklearn.decomposition

data = get_some_data()  # data of size (N_objects, n_dim)
n_dim = data.shape[1]
pca = sklearn.decomposition.PCA(n_dim)
transformed = pca.fit_transform(data)

# keep a copy of the full component matrix
all_components = pca.components_.copy()
# zero out the trailing (least significant) component vectors
to_zero = np.arange(n_dim // 2, n_dim)
pca.components_[to_zero] = np.zeros_like(pca.components_[to_zero])
new_data_1 = pca.inverse_transform(transformed)
# restore the original components
pca.components_ = all_components.copy()
# repeat with the other to_zero values
Note: it is important to zero out the vectors starting from the end of the matrix, because PCA sorts the component vectors by explained variance (most important first).
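If you would rather not mutate the fitted PCA object, the same reconstruction can also be written directly from the component matrix. The sketch below is only an illustration under the default whiten=False; the helper name reconstruct_with_k is hypothetical, not part of scikit-learn.

import numpy as np
from sklearn.decomposition import PCA

def reconstruct_with_k(pca, transformed, k):
    # Reconstruct from the first k principal components only; this is
    # equivalent to zeroing the trailing component vectors, but leaves
    # pca.components_ untouched. Assumes whiten=False (the default).
    return transformed[:, :k] @ pca.components_[:k] + pca.mean_

# usage on random data, mirroring the example above
data = np.random.rand(100, 10)
n_dim = data.shape[1]
pca = PCA(n_dim)
transformed = pca.fit_transform(data)
new_data_1 = reconstruct_with_k(pca, transformed, n_dim // 2)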
You can take the transformed data, set the last n components to 0, and then apply the inverse transform. Here is a reproducible example:
from numpy.random import rand
from sklearn.decomposition import PCA

# PCA transform
data = rand(100, 10)
n_dim = data.shape[1]
pca = PCA(n_dim)
transformed = pca.fit_transform(data)

# Inverse PCA: zero the last remove_n columns of the transformed data,
# then apply the regular inverse transform
def inverse_pca(pca_data, pca, remove_n):
    transformed = pca_data.copy()
    transformed[:, -remove_n:] = 0
    return pca.inverse_transform(transformed)

new_data = inverse_pca(transformed, pca, 3)
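As a quick sanity check, a small usage sketch that reuses data, n_dim, pca, transformed, and inverse_pca from the example above and prints the mean squared reconstruction error as more trailing components are removed:

import numpy as np

# compare reconstruction quality as more trailing components are zeroed
for remove_n in range(1, n_dim):
    restored = inverse_pca(transformed, pca, remove_n)
    mse = np.mean((data - restored) ** 2)
    print(f"removed {remove_n} trailing components, reconstruction MSE = {mse:.6f}")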