How do you plot the first few values of a PCA

Question

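I have run a principal component analysis on a medium-sized data set, but I only want to visualize a certain number of points from that analysis. They come from repeat observations, and I want to see how close the paired observations are to each other on the plot. I have set it up so that the first 18 individuals are the ones I want to plot, but I can't seem to plot only the first 18 points without also restricting the analysis to those 18 instead of the whole data set (43 individuals).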

# My data file
TrialsMR<-read.csv("NER_Trials_Matrix_Retrials.csv", row.names = 1)
# I ran the PCA of all of my values (without the categorical variable in col 8)
R.pca <- PCA(TrialsMR[,-8], graph = FALSE)
# When I try to plot only the first 18 individuals with this method, I get an error
fviz_pca_ind(R.pca[1:18,], 
             labelsize = 4, 
             pointsize = 1, 
             col.ind = TrialsMR$Bands, 
             palette = c("red", "blue", "black", "cyan", "magenta", "yellow", "gray", "green3", "pink" ))
# This is the error
Error in R.pca[1:18, ] : incorrect number of dimensions

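The 18 individuals form pairs, so using only 9 colours shouldn't cause an error (I hope).

Can anyone help me plot just the first 18 points from a PCA of the whole data set?

My data frame is similar in structure to this: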

TrialsMR
      Trees Bushes Shrubs Bands
JOHN1     1      4     18  BLUE
JOHN2     2      6     25  BLUE
CARL1     1      3     12 GREEN
CARL2     2      4     15 GREEN
GREG1     1      1     15   RED
GREG2     3     11     26   RED
MIKE1     1      7     19  PINK
MIKE2     1      1     25  PINK

where each band corresponds to a specific individual that was tested twice.

Answer 1

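You are using the wrong approach to specify the individuals: R.pca is the fitted PCA object (a list), not a data frame, so it cannot be subset with R.pca[1:18, ]. Use the select.ind argument to select the individuals you need, e.g.: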

data(iris)                                                  # test data

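If you want to rename your rows according to a grouping criterion so that they are easy to identify in the plot, do it before running the PCA. For example, let setosa fall in a series starting with 1 (something like 100-199), versicolor in 200-299, and virginica in 300-399.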

new_series <- c(101:150, 201:250, 301:350)                # there are 50 of each 
rownames(iris) <- new_series
R.pca <- prcomp(iris[,1:4], scale. = TRUE)                # pca on all 150 rows

library(factoextra)

fviz_pca_ind(X= R.pca, labelsize = 4, pointsize = 1, 
             select.ind= list(name = new_series[1:120]),  # 120 out of 150 selected
             col.ind = iris$Species ,
             palette = c("blue", "red", "green" ))
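The point of the call above is that the PCA is fitted on every row and only the drawing is restricted. A minimal base R sketch of the same idea, without factoextra, is shown here. The names pca_all, pca_sub, keep and scores are illustrative and not part of the original answer.

```r
# Fit the PCA on every row, then draw only a subset of the scores.
data(iris)
pca_all <- prcomp(iris[, 1:4], scale. = TRUE)   # fitted on all 150 rows

keep   <- 1:18                                  # rows to draw (illustrative choice)
scores <- pca_all$x[keep, 1:2]                  # PC1/PC2 coordinates of those rows

plot(scores, pch = 19, xlab = "PC1", ylab = "PC2")
text(scores, labels = keep, pos = 3, cex = 0.7) # label points by row index

# For contrast: a PCA fitted on the 18 rows alone is a different analysis
# and gives different coordinates for the same observations.
pca_sub <- prcomp(iris[keep, 1:4], scale. = TRUE)
```

Subsetting the score matrix after fitting keeps the axes defined by the whole data set, which is what the question asks for.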

Always check the R documentation before using a new function.

R documentation: fviz_pca {factoextra}

X
an object of class PCA [FactoMineR]; prcomp and princomp [stats]; dudi and pca [ade4]; expOutput/epPCA [ExPosition].

select.ind, select.var
a selection of individuals/variables to be drawn. Allowed values are NULL or a list containing the arguments name, cos2 or contrib
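Besides name, the documentation excerpt lists cos2 and contrib as selection criteria. The sketch below shows how they might be used on the iris PCA. It assumes the behaviour described in the factoextra help page: a cos2 value in [0, 1] acts as a threshold, and a contrib value greater than 1 picks that many top contributors. The object names p_cos2 and p_contrib are illustrative.

```r
library(factoextra)

data(iris)
R.pca <- prcomp(iris[, 1:4], scale. = TRUE)     # PCA on all 150 rows

# Draw only individuals whose cos2 on the plotted axes exceeds 0.9
p_cos2 <- fviz_pca_ind(R.pca, select.ind = list(cos2 = 0.9))

# Draw only the 20 individuals that contribute most to the plotted axes
p_contrib <- fviz_pca_ind(R.pca, select.ind = list(contrib = 20))

print(p_cos2)
print(p_contrib)
```

In both cases the PCA itself still uses every row; only the set of drawn points changes.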

For your specific dummy data, this should do it:

 R.pca <- prcomp(TrialsMR[,1:3], scale. = TRUE)

 fviz_pca_ind(X= R.pca, 
              select.ind= list(name = row.names(TrialsMR)[1:4]),  # 4 out of 8
              pointsize = 1, labelsize = 4,
              col.ind = TrialsMR$Bands,
              palette = c("blue", "green" )) + ylim(-1,1)

Dummy data:

TrialsMR <- read.table( text = "Trees Bushes Shrubs Bands
JOHN1     1      4     18  BLUE
JOHN2     2      6     25  BLUE
CARL1     1      3     12 GREEN
CARL2     2      4     15 GREEN
GREG1     1      1     15   RED
GREG2     3     11     26   RED
MIKE1     1      7     19  PINK
MIKE2     1      1     25  PINK", header = TRUE)
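Since the question fits the model with PCA() from FactoMineR rather than prcomp(), the same selection can be sketched on that kind of object, which the documentation above lists as an accepted input for X. This assumes the dummy data, where Bands is column 4 (column 8 in the real file), and wraps Bands in factor() so it is treated as a grouping variable even when read.table returns it as character. The name p is illustrative.

```r
library(FactoMineR)
library(factoextra)

TrialsMR <- read.table(text = "Trees Bushes Shrubs Bands
JOHN1     1      4     18  BLUE
JOHN2     2      6     25  BLUE
CARL1     1      3     12 GREEN
CARL2     2      4     15 GREEN
GREG1     1      1     15   RED
GREG2     3     11     26   RED
MIKE1     1      7     19  PINK
MIKE2     1      1     25  PINK", header = TRUE)

# PCA on all 8 rows, leaving out the categorical Bands column
R.pca <- PCA(TrialsMR[, -4], graph = FALSE)

# Draw only the first 4 individuals, coloured by band
p <- fviz_pca_ind(R.pca,
                  select.ind = list(name = rownames(TrialsMR)[1:4]),
                  labelsize = 4, pointsize = 1,
                  col.ind = factor(TrialsMR$Bands))
print(p)
```

For the real data, replacing 1:4 with 1:18 and -4 with -8 should give the plot described in the question.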


r

pca