Difference between ggplot2 and autoplot() after prcomp?

I plotted a PCA with autoplot(), but I want ellipses around only 2 of the groups, not all 3, so I switched to ggplot. However, my axes seem to differ between the autoplot and ggplot approaches. Compare p1 and p2:

library(ggplot2)
library(ggfortify)
library(tidyr)

x <- iris[1:4]
pc <- prcomp(x)
df <- cbind(pc$x[,1:2], iris[,5]) %>% as.data.frame()
df$PC1 <- as.numeric(df$PC1)
df$PC2 <- as.numeric(df$PC2)
df$V3 <- as.factor(df$V3)

#ggplot method
p1 <- ggplot(df, aes(PC1, PC2, colour = V3)) +
  geom_point(size = 3, aes(shape = V3)) +
  stat_ellipse(geom = "polygon", aes(fill = after_scale(alpha(colour, 0))),
               data = df[df$V3 == "1" | df$V3 == "2",], size = 1) 
p1



#autoplot method
y <- prcomp(x)
x2 <- as.data.frame(cbind(x, iris[,5]))
x2$`iris[, 5]` <- as.factor(x2$`iris[, 5]`)

p2<- autoplot(y, 
              data = x2, 
              colour = 'iris[, 5]', 
              label = F, 
              shape = 'iris[, 5]',
              size = 2)

p2


Created on 2022-02-22 by the reprex package (v2.0.1)

Why do I get different axes?

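In the autoplot method the principal component scores are scaled: each score is divided by that component's standard deviation times the square root of the number of observations. To get the same result with ggplot, apply the same scaling yourself: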

x <- iris[1:4]
pc <- prcomp(x)
df <- cbind(pc$x[,1:2], iris[,5]) %>% as.data.frame()
# apply the same scaling as autoplot(): score / (sdev * sqrt(n))
df$PC1 <- as.numeric(df$PC1) / (pc$sdev[1] * sqrt(nrow(iris)))
df$PC2 <- as.numeric(df$PC2) / (pc$sdev[2] * sqrt(nrow(iris)))
df$V3 <- as.factor(df$V3)

#ggplot method
p1 <- ggplot(df, aes(PC1, PC2, colour = V3)) +
  geom_point(size = 3, aes(shape = V3)) +
  stat_ellipse(geom = "polygon", aes(fill = after_scale(alpha(colour, 0))),
               data = df[df$V3 == "1" | df$V3 == "2",], size = 1) 
p1
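
To see how much the scaling changes the axis ranges, here is a minimal base-R sketch that applies the same division to the first two components and compares the ranges. The names `raw`, `scaled` and `n` are only illustrative, and the scaling formula is taken from the code above rather than from the ggfortify source, so treat it as a check under that assumption.

```r
# Sketch: apply the sdev * sqrt(n) scaling from the answer, base R only.
pc <- prcomp(iris[1:4])
n  <- nrow(iris)

# Unscaled scores, as plotted by the plain ggplot approach
raw <- pc$x[, 1:2]

# Divide each column by its own sdev * sqrt(n)
scaled <- sweep(raw, 2, pc$sdev[1:2] * sqrt(n), "/")

# The scaled scores cover a much narrower range than the raw ones
range(raw[, "PC1"])
range(scaled[, "PC1"])

# After scaling, each column's sum of squares equals (n - 1) / n
colSums(scaled^2)
```

If you would rather keep the unscaled scores in autoplot() instead, ggfortify documents a `scale` argument for PCA plots that is described as disabled by 0, so `autoplot(pc, scale = 0)` should give axes matching `pc$x`. It is worth confirming this against your installed version.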